Finally it works.
@Sean, I'm trying to set up the environment in an IDE so I can step into Spark -- that
will help me understand Spark's internal mechanisms.
@Ted, thanks. I'm using Maven, not SBT, but thanks for the suggestion
anyway.
For others who might be interested:
I chose the bigtop-dist profile, so under
It is true that you can persist SchemaRDDs / DataFrames to disk via
Parquet, but a lot of time is lost to inefficiencies. The in-memory
columnar cached representation is completely different from the
Parquet file format, and I believe there has to be a translation into
a Row (because ultimately
What do you mean by running LogQuery? You would run these using the
run-example script rather than in IntelliJ.
On Sun, Feb 1, 2015 at 4:01 AM, Yafeng Guo daniel.yafeng@gmail.com wrote:
Hi,
I'm setting up a dev environment with IntelliJ IDEA 14. I selected profile
scala-2.10, maven-3, hadoop
I have been working a lot recently with denormalised tables with lots of
columns, nearly 600. We are using this form to avoid joins.
I have tried to use cache table with this data, but it proves too expensive
as it seems to try to cache all the data in the table.
For data sets such as the one I
I've added support for sparse vectors and created HadamardTF for the
pipeline. Please take a look at my branch:
https://github.com/ogeagla/spark/compare/spark-mllib-weighting
Thanks!
It's not completely transparent, but you can do something like the following
today:
CACHE TABLE hotData AS SELECT columns, I, care, about FROM fullTable
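
With a table of several hundred columns, writing that statement by hand gets tedious, so here is a rough sketch of a helper that assembles it from a list of column names. The function name build_cache_statement and the example column names are made up for illustration; only the CACHE TABLE ... AS SELECT form comes from the statement above. It builds the string only and does not talk to Spark.

```python
import re

# Accept plain SQL identifiers only; anything else is rejected rather than quoted.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def build_cache_statement(cache_name, source_table, columns):
    """Return a CACHE TABLE ... AS SELECT statement for a subset of columns."""
    columns = list(columns)
    if not columns:
        raise ValueError("at least one column is required")
    for name in [cache_name, source_table] + columns:
        if not _IDENT.match(name):
            raise ValueError("not a plain identifier: %r" % (name,))
    return "CACHE TABLE %s AS SELECT %s FROM %s" % (
        cache_name, ", ".join(columns), source_table)


# Hypothetical column names standing in for the few hot columns of a wide table.
statement = build_cache_statement(
    "hotData", "fullTable", ["user_id", "event_time", "amount"])
print(statement)
```

The resulting string is what you would hand to your SQL context (for example through its sql method), assuming fullTable is already registered there.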
On Sun, Feb 1, 2015 at 3:03 AM, Mick Davies michael.belldav...@gmail.com
wrote:
I have been working a lot recently with denormalised tables
For the specific question of supplementing Standalone Mode with a custom
leader election protocol, this was actually already committed in master and
will be available in Spark 1.3:
https://github.com/apache/spark/pull/771/files
You can specify spark.deploy.recoveryMode = CUSTOM
and
1. Is IndexedRDD planned for 1.3?
https://issues.apache.org/jira/browse/SPARK-2365
2. Once IndexedRDD is in, is it planned to convert Word2VecModel to it from its
current Map[String,Array[Float]]?
Hi guys,
It's great to hear that this will be available in Spark 1.3! I will play
around with this feature and let you know the results of integrating
Hazelcast. Also, may I know the tentative release date for Spark 1.3?
Cheers,
Anjana.
On Mon, Feb 2, 2015 at 3:07 AM, Aaron Davidson