This is what I have added in my code:
rdd.persist(StorageLevel.MEMORY_ONLY_SER());
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
Do I also need to register classes via spark.kryo.classesToRegister?
Or is the above code sufficient to achieve a performance gain?
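For reference: explicit registration is optional unless spark.kryo.registrationRequired is set to true, but registering the classes you actually serialize usually helps, because unregistered classes make Kryo write the full class name with every object. A minimal sketch, where MyRecord stands in for a hypothetical user class:

import org.apache.spark.SparkConf;

// Equivalent to listing MyRecord under spark.kryo.classesToRegister.
SparkConf conf = new SparkConf()
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .registerKryoClasses(new Class<?>[]{ MyRecord.class });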
Hi, does someone have experience/knowledge of using
JavaPairRDD.treeAggregate?
Even sample code would be helpful; there isn't much about it on the web.
Thanks
Amit
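For reference, a minimal sketch of a treeAggregate call from the Java API. treeAggregate is defined on JavaRDDLike, so it is available on JavaPairRDD as well (the element type is then Tuple2<K, V>); the pair RDD "pairs" and the summing logic below are assumptions for illustration, not from this thread:

import org.apache.spark.api.java.JavaPairRDD;
import scala.Tuple2;

// pairs: a JavaPairRDD<String, Integer> built elsewhere (hypothetical).
// treeAggregate behaves like aggregate, but merges partition results in a
// multi-level tree rather than sending them all to the driver at once.
Integer total = pairs.treeAggregate(
    0,                          // zero value for the accumulator
    (acc, kv) -> acc + kv._2(), // seqOp: fold one (key, value) pair into acc
    (a, b) -> a + b,            // combOp: merge two partial sums
    2);                         // tree depth (2 is the default)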
> On Sat, Oct 31, 2015 at 11:18 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> My java knowledge is limited, but you may try with a hashmap and put RDDs
>> in it?
>>
>> On Sun, Nov 1, 2015 at 4:34 AM, amit tewari <amittewar...@gmail.com> wrote:
Hi
I need the ability to create RDDs programmatically inside my
program (e.g. based on a variable number of input files).
Can this be done?
I need this because I want to run the following statement inside an iteration:
JavaRDD<String> rdd1 = jsc.textFile("/file1.txt");
Thanks
Amit
>> Yes, this can be done. Quick Python equivalent:
>>
>> # In the driver: build one RDD per file and collect them in a list
>> fileList = ["/file1.txt", "/file2.txt"]
>> rdds = []
>> for f in fileList:
>>     rdd = sc.textFile(f)
>>     rdds.append(rdd)
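A rough Java equivalent of the same loop (assuming jsc is an existing JavaSparkContext; only the file names from the question are used):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;

// Build one RDD per input file and collect them in a list.
List<String> fileList = Arrays.asList("/file1.txt", "/file2.txt");
List<JavaRDD<String>> rdds = new ArrayList<>();
for (String f : fileList) {
    rdds.add(jsc.textFile(f));
}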
Hi
I am struggling to find out how to run a Scala script on Datastax Spark.
(SPARK_HOME/bin/spark-shell -i test.scala is deprecated.)
I don't want to use the interactive Scala prompt.
Thanks
AT
at 1:54 PM, amit tewari amittewar...@gmail.com wrote:
Actually the question was: will keyBy() accept multiple fields (e.g. x(0), x(1)) as the key?
On Tue, Jun 9, 2015 at 1:07 PM, amit tewari amittewar...@gmail.com wrote:
Thanks Akhil, as you suggested, I have to go with keyBy(route) as I need the
columns intact.
But will keyBy() accept multiple fields (e.g. x(0), x(1)) as the key?
basically requires RDD[K,V], and in your case it's ((String,
String), String, String). You can also look at keyBy if you don't want to
concatenate your keys.
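For reference, a sketch of the same composite-key idea in the Java API (the x(0)/x(1) syntax above is Scala; the input path, delimiter, and field layout here are assumptions):

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import scala.Tuple2;

// Split each line into fields, then key every record by its first two fields.
JavaRDD<String[]> rows = jsc.textFile("/file1.txt").map(line -> line.split(","));
JavaPairRDD<Tuple2<String, String>, String[]> keyed =
    rows.keyBy(x -> new Tuple2<>(x[0], x[1]));

keyBy accepts any function producing any key type, so a tuple of multiple fields works fine as the key.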
Thanks
Best Regards
On Tue, Jun 9, 2015 at 10:14 AM, amit tewari amittewar...@gmail.com
wrote:
Hi Dear Spark Users
I am very new to Spark/Scala.
I am using Datastax (4.7/Spark 1.2.1) and struggling with the following
error/issue.
I have already tried options like import org.apache.spark.SparkContext._ or
the explicit import org.apache.spark.SparkContext.rddToPairRDDFunctions,
but the error is not resolved.
I believe you have to set the following:
SPARK_HADOOP_VERSION=2.2.0 (or whatever your version is)
SPARK_YARN=true
then run sbt/sbt assembly.
If you are using Maven to compile:
mvn -Pyarn -Dhadoop.version=2.2.0 -Dyarn.version=2.2.0 -DskipTests clean package
Hope this helps,
-A
On Fri, Apr 4, 2014