Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy
Hi bo, how do we start? Is there a plan? Onboarding, arch/design diagram, tasks lined up, etc.? Thanks, Sarath. Sent from my iPhone > On Feb 23, 2022, at 10:27 AM, bo yang wrote: > Hi Sarath, thanks for your interest and willingness to contribute! The project supports lo…

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy
Hi bo, I am interested in contributing, but I don't have free access to any cloud provider and am not sure how I can get it. I know Google, AWS, and Azure only provide temporary free access, which may not be sufficient. Guidance is appreciated. Sarath Sent from my iPhone > On Feb 23, 2022, at 2…

Unsubscribe

2016-08-15 Thread Sarath Chandra

Issue with wholeTextFiles

2016-03-21 Thread Sarath Chandra
I'm using Hadoop 1.0.4 and Spark 1.2.0. I'm facing a strange issue. I have a requirement to read a small file from HDFS, and all of its content has to be read in one shot. So I'm using the Spark context's wholeTextFiles API, passing the HDFS URL for the file. When I try this from a spark shell it's…
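
For context, a minimal sketch (path and app name are placeholders, not from the thread) of the wholeTextFiles call being described, which returns each file's full content as a single record:

    import org.apache.spark.{SparkConf, SparkContext}

    object WholeFileRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("whole-file-read"))
        // wholeTextFiles returns (path, entireFileContent) pairs, so one small
        // file comes back as exactly one record
        val contents: String = sc
          .wholeTextFiles("hdfs://master:54310/user/hduser/small.txt")
          .map(_._2)
          .first()
        println(contents.take(100))
        sc.stop()
      }
    }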

Assign unique link ID

2015-10-31 Thread Sarath Chandra
…INK_ID", linkIDUDF(src_join("S1.RECORD_ID"), src("S2.RECORD_ID"))); Then in further lines I'm not able to refer to the "s1" columns from "src_link", like - var src_link_s1 = src_link.as("SL").select($"S1.RECORD_ID"); Please guide me. Regards, Sarath.

Re: Assign unique link ID

2015-10-31 Thread Sarath Chandra
…and their types. Any ideas how to tackle this? Regards, Sarath. On Sat, Oct 31, 2015 at 4:37 PM, ayan guha <guha.a...@gmail.com> wrote: > Can this be a solution? > 1. Write a function which will take a string and convert it to an md5 hash. > 2. From your base table, generate a string…
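
As a rough illustration of ayan's md5 suggestion; this sketch uses the later SparkSession/DataFrame API and made-up column names, not the exact schema from the thread:

    import java.security.MessageDigest
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{concat_ws, udf}

    object LinkIdSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("link-id").getOrCreate()
        import spark.implicits._

        // Made-up stand-in for the joined dataset from the thread
        val srcJoin = Seq(("R1", "A1"), ("R2", "A2")).toDF("S1_RECORD_ID", "S2_RECORD_ID")

        // md5 of the concatenated key columns, per ayan's suggestion
        val md5 = udf((s: String) =>
          MessageDigest.getInstance("MD5").digest(s.getBytes("UTF-8"))
            .map("%02x".format(_)).mkString)

        val srcLink = srcJoin.withColumn("LINK_ID",
          md5(concat_ws("|", $"S1_RECORD_ID", $"S2_RECORD_ID")))
        srcLink.show(truncate = false)
        spark.stop()
      }
    }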

PermGen Space Error

2015-07-29 Thread Sarath Chandra
…in some posts and blogs, I tried using the option spark.driver.extraJavaOptions to increase the size; I tried with 256 and 512 but still no luck. Please help me resolve the space issue. Thanks & Regards, Sarath.

Re: PermGen Space Error

2015-07-29 Thread Sarath Chandra
…laptop having 4 CPUs and 12GB RAM. On Wed, Jul 29, 2015 at 2:49 PM, fightf...@163.com wrote: Hi Sarath, did you try to use and increase spark.executor.extraJavaOptions -XX:PermSize= -XX:MaxPermSize= -- fightf...@163.com *From:* Sarath Chandra

Re: PermGen Space Error

2015-07-29 Thread Sarath Chandra
…with this option to rule out a config problem. On Wed, Jul 29, 2015 at 10:45 AM, Sarath Chandra sarathchandra.jos...@algofusiontech.com wrote: Yes. As mentioned at the end of my mail, I tried with both the 256 and 512 options, but the issue persists. I'm giving the following parameters to spark…
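
For anyone hitting the same error, a minimal sketch (not taken from the thread) of where the two options mentioned above usually go; sizes are illustrative:

    import org.apache.spark.{SparkConf, SparkContext}

    object PermGenOptions {
      def main(args: Array[String]): Unit = {
        // Illustrative sizes only; -XX:MaxPermSize applies to Java 7 and earlier (Java 8 removed PermGen)
        val conf = new SparkConf()
          .setAppName("permgen-example")
          .set("spark.executor.extraJavaOptions", "-XX:PermSize=256m -XX:MaxPermSize=512m")
        // Note: in client mode the driver JVM is already running when SparkConf is built,
        // so spark.driver.extraJavaOptions is normally passed to spark-submit
        // (e.g. via --driver-java-options) rather than set here.
        val sc = new SparkContext(conf)
        sc.stop()
      }
    }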

MLLib SVMWithSGD is failing for large dataset

2015-04-28 Thread sarath
I am trying to train a large dataset consisting of 8 million data points and 20 million features using SVMWithSGD, but it is failing after running for some time. I tried increasing num-partitions, driver-memory, executor-memory, and driver-max-resultSize. I also tried reducing the size of the dataset…

MLLib SVMWithSGD : java.lang.OutOfMemoryError: Java heap space

2015-04-16 Thread sarath
Hi, I'm trying to train an SVM on the KDD2010 dataset (available from libsvm), but I'm getting a java.lang.OutOfMemoryError: Java heap space error. The dataset is really sparse and has around 8 million data points and 20 million features. I'm using a cluster of 8 nodes (each with 8 cores and 64G RAM).
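
For reference, a minimal sketch of the RDD-based MLlib API under discussion, loading a LIBSVM-format file and training SVMWithSGD; the path, partition count, and iteration count are placeholders, not values from the thread:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.classification.SVMWithSGD
    import org.apache.spark.mllib.util.MLUtils

    object SVMTrain {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("svm-train"))
        // loadLibSVMFile keeps the features as sparse vectors, which matters for
        // a dataset with ~20 million features
        val data = MLUtils.loadLibSVMFile(sc, "hdfs:///user/hduser/kddb.libsvm")
          .repartition(512)   // smaller partitions reduce per-task memory pressure
          .cache()
        val model = SVMWithSGD.train(data, 100)   // 100 iterations, illustrative
        println(s"weights have ${model.weights.size} dimensions")
        sc.stop()
      }
    }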

Re: Unable to submit spark job to mesos cluster

2015-03-04 Thread Sarath Chandra
…$.startServiceOnPort(Utils.scala:1450) at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56) at org.apache.spark.SparkEnv$.create(SparkEnv.scala:156) at org.apache.spark.SparkContext.<init>(SparkContext.scala:203) at Test.main(Test.java:7) Regards, Sarath. Thanks & Regards…

Unable to submit spark job to mesos cluster

2015-03-04 Thread Sarath Chandra
…) at com.algofusion.reconciliation.execution.utils.ExecutionUtils.<clinit>(ExecutionUtils.java:130) ... 2 more Regards, Sarath.

Parallel spark jobs on mesos cluster

2014-09-30 Thread Sarath Chandra
…("spark.executor.memory", "3g") .set("spark.cores.max", "4") .set("spark.task.cpus", "4") .set("spark.executor.uri", "hdfs://sarath:54310/user/hduser/spark-1.0.1-bin-hadoop1.tgz"); logger.log(Level.INFO, "Getting spark context..."); var sc = new SparkContext(conf); sc.addJar("/home/sarath/Projects…

Parallel spark jobs on standalone cluster

2014-09-25 Thread Sarath Chandra
…Also, in the spark job submission program I'm calling SparkContext.stop at the end of execution. Sometimes all jobs fail with status as Exited. Please let me know what is going wrong and how to overcome the issue. ~Sarath

Saving RDD with array of strings

2014-09-21 Thread Sarath Chandra
…)); newLines.saveAsTextFile(hdfsPath); ... ... def myfunc(line: String): Array[String] = { line.split(";") } Thanks, ~Sarath.
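
Since saveAsTextFile writes each element's toString, an Array[String] per record comes out as something like "[Ljava.lang.String;@...". A short sketch (paths and delimiter are made up) of one common way to join the array back into a line before saving:

    import org.apache.spark.{SparkConf, SparkContext}

    object SaveSplitLines {
      def myfunc(line: String): Array[String] = line.split(";")

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("save-split-lines"))
        val lines = sc.textFile("hdfs://master:54310/user/hduser/input.txt")
        // Re-join each Array[String] into one delimited line before saving,
        // so the output file contains the fields rather than the array's toString
        val newLines = lines.map(line => myfunc(line).mkString("|"))
        newLines.saveAsTextFile("hdfs://master:54310/user/hduser/output")
        sc.stop()
      }
    }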

Re: Task not serializable

2014-09-10 Thread Sarath Chandra
Thanks Sean. Please find attached my code. Let me know your suggestions/ideas. Regards, *Sarath* On Wed, Sep 10, 2014 at 8:05 PM, Sean Owen so...@cloudera.com wrote: You mention that you are creating a UserGroupInformation inside your function, but something is still serializing it. You

Re: Task not serializable

2014-09-06 Thread Sarath Chandra
…and written its contents as an anonymous function inside the map function. This time the execution succeeded. I understood Sean's explanation, but I would appreciate references to a more detailed explanation and examples of writing efficient spark programs that avoid such pitfalls. ~Sarath On 06-Sep-2014 4:32 pm…

Task not serializable

2014-09-05 Thread Sarath Chandra
…How do I overcome these exceptions? ~Sarath.

Re: Task not serializable

2014-09-05 Thread Sarath Chandra
Hi Akhil, I've done this for the classes which are in my scope, but what should I do with classes that are out of my scope? For example, org.apache.hadoop.io.Text. Also, I'm using several 3rd party libraries like jeval. ~Sarath On Fri, Sep 5, 2014 at 7:40 PM, Akhil Das ak...@sigmoidanalytics.com wrote…
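
A small self-contained sketch of the pattern discussed in this thread: constructing a non-serializable object such as org.apache.hadoop.io.Text inside the task instead of capturing it from the driver (the data and transformation are made up):

    import org.apache.hadoop.io.Text
    import org.apache.spark.{SparkConf, SparkContext}

    object SerializationSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("ser-sketch").setMaster("local[2]"))
        val rdd = sc.parallelize(Seq("alpha", "beta", "gamma"))

        // Capturing a Text created on the driver inside the closure would fail,
        // because Text is Writable but not java.io.Serializable:
        //   val t = new Text("x"); rdd.map(v => t.toString + v)   // Task not serializable

        // Instead, build such objects inside the task itself:
        val ok = rdd.mapPartitions { iter =>
          val t = new Text()   // created on the executor, never shipped from the driver
          iter.map { v => t.set(v); t.toString.toUpperCase }
        }
        ok.collect().foreach(println)
        sc.stop()
      }
    }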

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
Added the below 2 lines just before the sql query line - ... file1_schema.count; file2_schema.count; ... and it started working, but I couldn't work out the reason. Can someone please explain what was happening earlier and what is happening with the addition of these 2 lines? ~Sarath On Thu…

Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
? ~Sarath

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
…are info messages. What else do I need to check? ~Sarath On Wed, Jul 16, 2014 at 7:23 PM, Soumya Simanta soumya.sima...@gmail.com wrote: Check your executor logs for the output, or if your data is not big, collect it in the driver and print it. On Jul 16, 2014, at 9:21 AM, Sarath Chandra…

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Yes, it is appearing on the Spark UI, and remains there with state as RUNNING till I press Ctrl+C in the terminal to kill the execution. Barring the statements to create the spark context, if I copy-paste the lines of my code into the spark shell, it runs perfectly, giving the desired output. ~Sarath

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
…at 7:59 PM, Soumya Simanta soumya.sima...@gmail.com wrote: Can you try submitting a very simple job to the cluster? On Jul 16, 2014, at 10:25 AM, Sarath Chandra sarathchandra.jos...@algofusiontech.com wrote: Yes, it is appearing on the Spark UI, and remains there with state as RUNNING till…

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
…CLASSPATH=$HADOOP_PREFIX/conf:$SPARK_HOME/lib/*:test1-0.1.jar export CONFIG_OPTS=-Dspark.jars=test1-0.1.jar java -cp $CLASSPATH $CONFIG_OPTS test.Test4 spark://master:7077 /usr/local/spark-1.0.1-bin-hadoop1 hdfs://master:54310/user/hduser/file1.csv hdfs://master:54310/user/hduser/file2.csv ~Sarath