Number of sortBy output partitions

2016-07-21 Thread Simone Franzini
not as this is probably due to the way that sortBy is implemented, but I thought I would ask anyway. Should it matter, I am running Spark 1.4.2 (DataStax Enterprise). Thanks, Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

Spark on DSE Cassandra with multiple data centers

2016-05-11 Thread Simone Franzini
() That is, it appears that the in the hadoop command is being ignored and it is trying to connect to cfs: rather than additional_cfs. Has anybody else run into this? Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

Re: Spark standalone workers, executors and JVMs

2016-05-04 Thread Simone Franzini
actually using and how do I set those? As far as I understand, the worker does not need many resources, as it only spawns executors. Is that correct? Thanks, Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Mon, May 2, 2016 at 7:47 PM, Mohammed Guller <moham...@glassbeam.

Fwd: Spark standalone workers, executors and JVMs

2016-05-02 Thread Simone Franzini
issues (GC and such). As of Spark 1.4 it is possible to either deploy multiple workers (SPARK_WORKER_INSTANCES + SPARK_WORKER_CORES) or multiple executors per worker (--executor-cores). Which option is preferable and why? Thanks, Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

updateStateByKey: cleaning up state for keys not in current window

2015-01-09 Thread Simone Franzini
in this case? Or, in other words, how can I clear the state for a key when Seq[V] is empty? Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini
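
For readers landing on this thread: Spark Streaming's documented behavior is that a key is removed from the state when the update function returns None. The following is a minimal sketch of that idea, written as plain Scala with no Spark dependency. The names StateCleanup and updateCount, and the choice of a running Int count as the state, are assumptions made for illustration and are not taken from the original message.

```scala
// Hypothetical sketch: StateCleanup and updateCount are illustrative names.
object StateCleanup {
  // Same shape as the function expected by updateStateByKey:
  // the new values seen for a key in this batch, plus the previous state (if any).
  def updateCount(newValues: Seq[Int], state: Option[Int]): Option[Int] =
    if (newValues.isEmpty) None                    // no data for this key: drop its state
    else Some(state.getOrElse(0) + newValues.sum)  // otherwise keep a running total
}
```

Assuming a DStream[(String, Int)] named pairs, this could be wired in as pairs.updateStateByKey(StateCleanup.updateCount _). Note the trade-off: a key that is silent for a single batch loses its accumulated total, so a real job might instead keep a last-seen timestamp in the state and return None only after some timeout.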

Re: NullPointerException When Reading Avro Sequence Files

2014-12-15 Thread Simone Franzini
To me this looks like an error internal to the REPL. I am not sure what is causing it. Personally, I never use the REPL; can you try typing up your program and running it from an IDE or with spark-submit to see if you still get the same error? Simone Franzini, PhD http://www.linkedin.com

Re: NullPointerException When Reading Avro Sequence Files

2014-12-09 Thread Simone Franzini
here: http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-read-this-avro-file-using-spark-amp-scala-td19400.html#a19491 Maybe there is a simpler solution to your problem but I am not that much of an expert yet. I hope this helps. Simone Franzini, PhD http://www.linkedin.com

Re: NullPointerException When Reading Avro Sequence Files

2014-12-09 Thread Simone Franzini
You can use this Maven dependency: <dependency> <groupId>com.twitter</groupId> <artifactId>chill-avro</artifactId> <version>0.4.0</version> </dependency> Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Tue, Dec 9, 2014 at 9:53 AM, Cristovao Jose Domingues Cordeiro

Re: Kryo NPE with Array

2014-12-02 Thread Simone Franzini
is registered through the Chill AllScalaRegistrar which is called by the Spark Kryo serializer. I thought I'd document this in case somebody else is running into a similar issue. Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Wed, Nov 26, 2014 at 7:40 PM, Simone Franzini

Re: Spark SQL 1.0.0 - RDD from snappy compress avro file

2014-11-29 Thread Simone Franzini
Did you have a look at my reply in this thread? http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-read-this-avro-file-using-spark-amp-scala-td19400.html I am using 1.1.0 though, so not sure if that code would work entirely with 1.0.0, but you can try. Simone Franzini, PhD http

Re: Kryo NPE with Array

2014-11-26 Thread Simone Franzini
new to Scala and I can't see how I would do this. In the worst case, could I override the newKryo method and put my configuration there? It appears to me that method is the one where the kryo instance is created. Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Tue, Nov 25, 2014

Kryo NPE with Array

2014-11-25 Thread Simone Franzini
(kryo: Kryo) { kryo.register(...) } } Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

Re: How can I read this avro file using spark scala?

2014-11-21 Thread Simone Franzini
for that: GenericRecordSerializer kryo.register(classOf[MyAvroClass], AvroSerializer.SpecificRecordBinarySerializer[MyAvroClass]) } } Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Fri, Nov 21, 2014 at 7:04 AM, thomas j beanb...@googlemail.com wrote: I've been able to load

Reading nested JSON data with Spark SQL

2014-11-19 Thread Simone Franzini
to scala.collection.immutable.Map How can I read such a field? Am I just missing something small or should I be looking for a completely different alternative to reading JSON? Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

Re: Reading nested JSON data with Spark SQL

2014-11-19 Thread Simone Franzini
This works great, thank you! Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Wed, Nov 19, 2014 at 3:40 PM, Michael Armbrust mich...@databricks.com wrote: You can extract the nested fields in sql: SELECT field.nestedField ... If you don't do that then nested fields

Declaring multiple RDDs and efficiency concerns

2014-11-14 Thread Simone Franzini
an efficiency issue or just a stylistic one? Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

Accessing RDD within another RDD map

2014-11-13 Thread Simone Franzini
inside the map statement. I am failing to understand what I am doing wrong. Can anyone help with this? Thanks, Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini

Re: AVRO specific records

2014-11-07 Thread Simone Franzini
is that this writes to a plain text file. I need to write to binary AVRO. What am I missing? Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Thu, Nov 6, 2014 at 3:15 PM, Simone Franzini captainfr...@gmail.com wrote: Benjamin, Thanks for the snippet. I have tried using it, but unfortunately I

Re: AVRO specific records

2014-11-06 Thread Simone Franzini
) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini On Wed, Nov 5, 2014 at 4:24 PM, Laird, Benjamin benjamin.la...@capitalone.com wrote: Something like this works and is how I

AVRO specific records

2014-11-05 Thread Simone Franzini
How can I read/write AVRO specific records? I found several snippets using generic records, but nothing with specific records so far. Thanks, Simone Franzini, PhD http://www.linkedin.com/in/simonefranzini