Hi,
I have a Kafka topic that contains dozens of different types of messages.
For each message type I need to create a separate DStream. Currently I have
to filter the Kafka stream over and over, once per type, which is very
inefficient.
So what's the best way to do dispatching in Spark Streaming? (one DStream
-
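One way to avoid scanning the stream once per type is to tag each record and group a batch by type in a single pass (in Spark Streaming this would typically happen inside a transform/foreachRDD). A minimal plain-Java sketch of the idea, with the "type:payload" record format being an assumption for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Dispatch {
    // Each record is "type:payload"; group a batch by its type prefix in
    // one pass instead of filtering the whole stream once per type.
    static Map<String, List<String>> dispatch(List<String> batch) {
        return batch.stream()
                .collect(Collectors.groupingBy(m -> m.split(":", 2)[0]));
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("order:o1", "click:c1", "order:o2");
        Map<String, List<String>> byType = dispatch(batch);
        System.out.println(byType.get("order").size()); // 2
    }
}
```

The same single-pass grouping can be expressed on an RDD with a pair transformation keyed by the type field, so each micro-batch is traversed only once.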
Hi guys,
I was trying to find counters in Spark that report the amount of CPU or
memory used (in some metric) by a task, stage, or job, but I could not find
any.
Is any such counter available?
Thank you,
Robert
Cheng, this is great info. I have a follow-up question. There are a few
very common data types (e.g. Joda DateTime) that are not directly supported
by SparkSQL. Do you know if there are any plans for accommodating some
common data types in SparkSQL? They don't need to be first-class
datatypes, but
Hi all.
The code is linked from my repo:
https://github.com/ehiggs/spark-terasort
This is an example Spark program for running TeraSort benchmarks. It is
based on work from Reynold Xin's branch
https://github.com/rxin/spark/tree/terasort, but it is not the same
TeraSort program that
Actually, I did a little investigation into Joda-Time when I was working on
SPARK-4987 for Timestamp ser-de in Parquet format. I think Joda offers an
interface to get a Java object from a Joda time object natively.
For example, to transform a java.util.Date (parent of java.sql.Date and
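To make the conversion path concrete: Joda's DateTime exposes toDate() and getMillis(), and from a java.util.Date the bridge to java.sql types is plain JDK. A minimal sketch (the Joda step is shown only as a comment so the code needs no extra dependency):

```java
import java.sql.Timestamp;
import java.util.Date;

public class JodaBridge {
    // With Joda you would first call dateTime.toDate() or
    // dateTime.getMillis(); from a java.util.Date onward it is plain JDK.
    static Timestamp toSqlTimestamp(Date d) {
        return new Timestamp(d.getTime());
    }

    public static void main(String[] args) {
        Date d = new Date(1_428_800_000_000L); // fixed millis for reproducibility
        Timestamp ts = toSqlTimestamp(d);
        System.out.println(ts.getTime() == d.getTime()); // true
    }
}
```

Since java.sql.Timestamp is already understood by SparkSQL, this kind of shim is often enough without a full UDT.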
These common UDTs can always be wrapped in libraries and published to
spark-packages http://spark-packages.org/ :-)
Cheng
On 4/12/15 3:00 PM, Justin Yip wrote:
Cheng, this is great info. I have a follow up question. There are a
few very common data types (i.e. Joda DateTime) that is not
I have to create some kind of index from my JavaRDD<Object>; it should be
something like JavaPairRDD<uniqueIndex, Object>, but zipWithIndex gives
(Object, Long). Later I need to use this RDD for a join, so it looks like
it won't work for me.
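In the Spark Java API, rdd.zipWithIndex() returns pairs of (element, index); to get the index as the key for a join you can swap each pair, e.g. with mapToPair(t -> new Tuple2<>(t._2() + 1, t._1())) to also start counting at 1. A plain-Java stand-in for that swap (no Spark dependency, just the same logic):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexKey {
    // Stand-in for rdd.zipWithIndex().mapToPair(t -> t.swap()):
    // pair each element with its 1-based position, keyed by the index so
    // the index can serve as the join key.
    static Map<Long, String> indexAsKey(List<String> data) {
        Map<Long, String> out = new LinkedHashMap<>();
        for (int i = 0; i < data.size(); i++) {
            out.put((long) i + 1, data.get(i)); // index first, element second
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, String> byIndex = indexAsKey(Arrays.asList("a", "b", "c"));
        System.out.println(byIndex.get(1L)); // a
    }
}
```

Note that zipWithIndex itself starts at 0, hence the +1 to match "first object should have 1".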
On 9 April 2015 at 04:17, Ted Yu yuzhih...@gmail.com wrote:
Please
bq. will return something like JavaPairRDD<Object, Long>
The Long component of the pair fits your description of an index. What other
requirement does zipWithIndex not satisfy?
Cheers
On Sun, Apr 12, 2015 at 1:16 PM, Jeetendra Gangele gangele...@gmail.com
wrote:
Hi All, I have a JavaRDD<Object> and I want to convert it to a
JavaPairRDD<Index, Object>. The index should be unique and should maintain
the order: the first object should get 1, the second 2, and so on.
I tried using zipWithIndex, but it returns something like
JavaPairRDD<Object, Long>.