How can I read/write AVRO specific records?
I found several snippets using generic records, but nothing with specific
records so far.
Thanks,
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
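[For reference, one common way to read Avro specific records in Spark is newAPIHadoopFile with AvroKeyInputFormat. A sketch, assuming a hypothetical Avro-generated SpecificRecord class MyAvroClass and an example path:]

```scala
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.io.NullWritable
import org.apache.spark.SparkContext

// Sketch: read Avro specific records as an RDD[MyAvroClass].
// MyAvroClass is a hypothetical Avro-generated SpecificRecord.
def readAvro(sc: SparkContext, path: String) =
  sc.newAPIHadoopFile(
    path,
    classOf[AvroKeyInputFormat[MyAvroClass]],
    classOf[AvroKey[MyAvroClass]],
    classOf[NullWritable]
  ).map { case (key, _) => key.datum() }
```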
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
On Wed, Nov 5, 2014 at 4:24 PM, Laird, Benjamin
benjamin.la...@capitalone.com wrote:
Something like this works and is how I
is that this writes to a plain text file. I need to write
to binary AVRO. What am I missing?
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
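[For the record, one way to get binary Avro output rather than plain text is saveAsNewAPIHadoopFile with AvroKeyOutputFormat. A sketch, again assuming a hypothetical generated class MyAvroClass and an example path:]

```scala
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.rdd.RDD

// Sketch: write an RDD of Avro specific records as binary Avro files.
def writeAvro(records: RDD[MyAvroClass], path: String): Unit = {
  val job = Job.getInstance()
  AvroJob.setOutputKeySchema(job, MyAvroClass.getClassSchema)
  records
    .map(r => (new AvroKey(r), NullWritable.get()))
    .saveAsNewAPIHadoopFile(
      path,
      classOf[AvroKey[MyAvroClass]],
      classOf[NullWritable],
      classOf[AvroKeyOutputFormat[MyAvroClass]],
      job.getConfiguration
    )
}
```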
On Thu, Nov 6, 2014 at 3:15 PM, Simone Franzini captainfr...@gmail.com
wrote:
Benjamin,
Thanks for the snippet. I have tried using it, but unfortunately I
inside the map
statement. I am failing to understand what I am doing wrong.
Can anyone help with this?
Thanks,
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
an efficiency issue or just a stylistic one?
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
to
scala.collection.immutable.Map
How can I read such a field? Am I just missing something small or should I
be looking for a completely different alternative to reading JSON?
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
This works great, thank you!
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
On Wed, Nov 19, 2014 at 3:40 PM, Michael Armbrust mich...@databricks.com
wrote:
You can extract the nested fields in SQL: SELECT field.nestedField ...
If you don't do that then nested fields
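[Michael Armbrust's SELECT suggestion can be sketched as follows, on the Spark 1.x API; the file name, table name, and schema are hypothetical:]

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Sketch: pull a nested field out with SQL instead of navigating Rows.
// Assumes JSON records shaped like {"field": {"nestedField": "..."}}.
def selectNested(sc: SparkContext): Unit = {
  val sqlContext = new SQLContext(sc)
  val events = sqlContext.jsonFile("events.json") // example path
  events.registerTempTable("events")
  sqlContext.sql("SELECT field.nestedField FROM events").collect()
}
```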
for that:
GenericRecordSerializer

class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[MyAvroClass],
      AvroSerializer.SpecificRecordBinarySerializer[MyAvroClass])
  }
}
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
On Fri, Nov 21, 2014 at 7:04 AM, thomas j beanb...@googlemail.com wrote:
I've been able to load
  override def registerClasses(kryo: Kryo) {
    kryo.register(...)
  }
}
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
new to Scala and I can't
see how I would do this. In the worst case, could I override the newKryo
method and put my configuration there? It appears to me that method is the
one where the kryo instance is created.
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
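[There should be no need to override newKryo: Spark lets you plug a registrator in through configuration. A sketch; the registrator class name is hypothetical:]

```scala
import org.apache.spark.SparkConf

// Sketch: point Spark at a custom KryoRegistrator via configuration
// instead of overriding newKryo. "com.example.MyRegistrator" is a
// hypothetical class implementing org.apache.spark.serializer.KryoRegistrator.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "com.example.MyRegistrator")
```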
On Tue, Nov 25, 2014
Did you have a look at my reply in this thread?
http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-read-this-avro-file-using-spark-amp-scala-td19400.html
I am using 1.1.0 though, so not sure if that code would work entirely with
1.0.0, but you can try.
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
is registered
through the Chill AllScalaRegistrar which is called by the Spark Kryo
serializer.
I thought I'd document this in case somebody else is running into a similar
issue.
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
On Wed, Nov 26, 2014 at 7:40 PM, Simone Franzini
here:
http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-read-this-avro-file-using-spark-amp-scala-td19400.html#a19491
Maybe there is a simpler solution to your problem but I am not that much of
an expert yet. I hope this helps.
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
You can use this Maven dependency:
<dependency>
  <groupId>com.twitter</groupId>
  <artifactId>chill-avro</artifactId>
  <version>0.4.0</version>
</dependency>
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
On Tue, Dec 9, 2014 at 9:53 AM, Cristovao Jose Domingues Cordeiro
To me this looks like an internal error in the REPL; I am not sure what is
causing it.
Personally, I never use the REPL. Can you try typing up your program and
running it from an IDE or with spark-submit, and see whether you still get the
same error?
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
in this case? Or, in other
words, how can I clear the state for a key when Seq[V] is empty?
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
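[For reference, updateStateByKey drops a key's state when the update function returns None, so the empty-Seq case can be used to clear it. A sketch with a hypothetical running-count state:]

```scala
// Sketch: clear per-key state when no new values arrive in the batch.
// Returning None from the update function removes the key entirely.
val updateFunc = (values: Seq[Int], state: Option[Int]) =>
  if (values.isEmpty) None // no new data for this key: drop its state
  else Some(state.getOrElse(0) + values.sum)

// stream.updateStateByKey(updateFunc)  // stream: DStream[(K, Int)]
```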
That is, it appears that the scheme specified in the hadoop command is
being ignored and it is trying to connect to cfs: rather than
additional_cfs.
Has anybody else run into this?
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
issues (GC and such).
As of Spark 1.4 it is possible either to deploy multiple workers
(SPARK_WORKER_INSTANCES + SPARK_WORKER_CORES) or to run multiple executors per
worker (--executor-cores). Which option is preferable, and why?
Thanks,
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
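[The two options above can be sketched as configuration; the values are examples only, not recommendations:]

```shell
# Option 1: multiple workers per node (standalone mode, spark-env.sh).
SPARK_WORKER_INSTANCES=2
SPARK_WORKER_CORES=8

# Option 2: one worker, multiple smaller executors per application:
# spark-submit --executor-cores 4 --executor-memory 8g ...
```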
actually
using and how do I set those? As far as I understand, the worker does not
need many resources, as it only spawns executors. Is that correct?
Thanks,
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini
On Mon, May 2, 2016 at 7:47 PM, Mohammed Guller <moham...@glassbeam.
not as this is probably due to the way that sortBy is
implemented, but I thought I would ask anyway.
In case it matters, I am running Spark 1.4.2 (DataStax Enterprise).
Thanks,
Simone Franzini, PhD
http://www.linkedin.com/in/simonefranzini