Contextual bandits

2018-03-09 Thread ey-chih chow
Hi, does Spark MLlib support contextual bandits? How can we use Spark MLlib to implement a contextual bandit? Thanks. Best regards, Ey-Chih
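
MLlib has no built-in contextual-bandit API. A minimal epsilon-greedy sketch can be assembled from MLlib's own logistic regression by training one reward model per arm; everything below (the per-arm logs, epsilon, the helper names) is a hypothetical illustration, not an MLlib feature:

    import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithLBFGS}
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.rdd.RDD
    import scala.util.Random

    // one reward model per arm, fit on that arm's logged (reward, context) pairs
    def trainPerArm(logs: Seq[RDD[LabeledPoint]]): Seq[LogisticRegressionModel] =
      logs.map { rdd =>
        val m = new LogisticRegressionWithLBFGS().run(rdd)
        m.clearThreshold() // emit reward probabilities instead of 0/1 labels
        m
      }

    // epsilon-greedy choice for one context: explore with probability eps, else exploit
    def chooseArm(models: Seq[LogisticRegressionModel], context: Vector, eps: Double = 0.1): Int =
      if (Random.nextDouble() < eps) Random.nextInt(models.size)
      else models.zipWithIndex.maxBy { case (m, _) => m.predict(context) }._2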

corresponding sql for query against LocalRelation

2016-01-27 Thread ey-chih chow
Hi, for a query against a LocalRelation, does anybody know what the corresponding SQL looks like? Thanks. Best regards, Ey-Chih Chow
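
For context: a LocalRelation is the logical-plan node Catalyst creates for a DataFrame built from a local in-memory collection, and the nearest SQL spelling is an inline VALUES table. A sketch, assuming a recent Spark with a SparkSession and inline-VALUES support (which the releases contemporary with this thread may lack):

    // a DataFrame over a local Seq; its logical plan is a LocalRelation
    val df = spark.createDataFrame(Seq((1, "a"), (2, "b"))).toDF("id", "name")
    println(df.queryExecution.logical) // root node is a LocalRelation

    // roughly the same relation written as SQL
    spark.sql("SELECT * FROM VALUES (1, 'a'), (2, 'b') AS t(id, name)").show()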

IncompatibleClassChangeError

2015-03-05 Thread ey-chih chow
... What else should I do to fix the problem? Ey-Chih Chow
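
In Spark jobs this error is most often a symptom of mixing Hadoop 1.x and 2.x classes on one classpath (TaskAttemptContext changed from a class to an interface between the two lines). A small diagnostic sketch under that assumption, printing which jar a suspect class was loaded from:

    // run on the driver: locate the jar that provides TaskAttemptContext
    val cls = Class.forName("org.apache.hadoop.mapreduce.TaskAttemptContext")
    println(cls.getProtectionDomain.getCodeSource.getLocation)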

how to improve performance of spark job with large input to executor?

2015-02-27 Thread ey-chih chow
... to have better performance for large input data? Thanks. Ey-Chih Chow
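
One generic lever, as a sketch (assuming oversized partitions are the bottleneck; the path and partition count are placeholders, not recommendations):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("large-input"))
    // fan the input out into more, smaller partitions so each task's
    // working set fits comfortably in executor memory
    val parts = sc.textFile("s3n://bucket/big-input").repartition(400)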

RE: no space left at worker node

2015-02-09 Thread ey-chih chow
... back with the following. What's wrong with this? Ey-Chih Chow ===
Date: Sun, 8 Feb 2015 22:27:17 -0800
Sending launch command to spark://ec2-54-213-73-150.us-west-2.compute.amazonaws.com:7077
Driver successfully submitted as driver-20150209185453-0010
... waiting before polling master ...

RE: no space left at worker node

2015-02-09 Thread ey-chih chow
Thanks. But in spark-submit I specified the jar file in the form local:/spark-etl-0.0.1-SNAPSHOT.jar, and it comes back with the following. What's wrong with this? Ey-Chih Chow ===
Date: Sun, 8 Feb 2015 22:27:17 -0800
Sending launch command to spark://ec2-54-213-73-150.us-west-2...

RE: no space left at worker node

2015-02-08 Thread ey-chih chow
... defaults 0 0
proc /proc proc defaults 0 0
/dev/sdb /mnt auto defaults,noatime,nodiratime,comment=cloudconfig 0 0
/dev/sdc /mnt2 auto defaults,noatime,nodiratime,comment=cloudconfig 0 0
There is no entry for /dev/xvdb. Ey-Chih Chow ...

RE: no space left at worker node

2015-02-08 Thread ey-chih chow
Thanks Gen. How can I check whether /dev/sdc is mounted correctly? In general, the problem shows up when I submit the second or third job; the first job I submit will most likely succeed. Ey-Chih Chow Date: Sun, 8 Feb 2015 18:18:03 +0100 Subject: Re: no space left at worker node From: gen.tan...

RE: no space left at worker node

2015-02-08 Thread ey-chih chow
... -Mike From: gen tang gen.tan...@gmail.com To: ey-chih chow eyc...@hotmail.com Cc: user@spark.apache.org Sent: Sunday, February 8, 2015 6:09 AM Subject: Re: no space left at worker node Hi, in fact I met this problem before; it is a bug in AWS. Which type ...

RE: no space left at worker node

2015-02-08 Thread ey-chih chow
Hi Gen, thanks. I save my logs in a file under /var/log. This is the only place to save data. Will the problem go away if I use a better machine? Best regards, Ey-Chih Chow Date: Sun, 8 Feb 2015 23:32:27 +0100 Subject: Re: no space left at worker node From: gen.tan...@gmail.com To: eyc...

RE: no space left at worker node

2015-02-08 Thread ey-chih chow
Is there any way to stop Spark from copying the jar file to that directory? I have a fat jar that is already copied to the worker nodes with the copy-dir command. Why does Spark need to save the jar to ./spark/work/appid each time a job starts? Ey-Chih Chow Date: Sun, 8 Feb ...
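
Spark standalone cannot skip the per-application jar copy, but it can be told to clean the old app directories up on its own. A short sketch of the relevant worker-side settings (real spark.worker.cleanup.* properties; the values are illustrative), placed in spark-env.sh on each worker:

    # spark-env.sh sketch: purge old application dirs under spark/work automatically
    export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
      -Dspark.worker.cleanup.interval=1800 \
      -Dspark.worker.cleanup.appDataTtl=604800"

Each application still gets its own copy of the jar under work/, but the directory no longer grows without bound.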

RE: no space left at worker node

2015-02-08 Thread ey-chih chow
This way, the input and output paths of the job are all in S3; I did not use HDFS paths as input or output. Best regards, Ey-Chih Chow From: eyc...@hotmail.com To: gen.tan...@gmail.com CC: user@spark.apache.org Subject: RE: no space left at worker node Date: Sun, 8 Feb 2015 14:57:15 -0800 ...

no space left at worker node

2015-02-07 Thread ey-chih chow
... 30963708 1729652 27661192 6% /mnt (end of a df listing) Does anybody know how to fix this? Thanks. Ey-Chih Chow
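
A hedged sketch of one fix for this class of problem (an assumption, not confirmed in the thread: the small root volume fills up while the large ephemeral disks sit mostly idle): point Spark's scratch space at the big volumes.

    import org.apache.spark.{SparkConf, SparkContext}

    // send shuffle/spill scratch files to the large /mnt disks instead of /
    // (on standalone clusters a worker-side SPARK_LOCAL_DIRS takes precedence)
    val conf = new SparkConf()
      .setAppName("etl")
      .set("spark.local.dir", "/mnt/spark,/mnt2/spark")
    val sc = new SparkContext(conf)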

synchronously submitting spark jobs

2015-02-04 Thread ey-chih chow
Hi, I would like to submit Spark jobs one by one, so that the next job is not submitted until the previous one succeeds. spark-submit can only submit jobs asynchronously. Is there any way I can submit jobs sequentially? Thanks. Ey-Chih Chow
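
In client mode spark-submit actually blocks until the application finishes; it is cluster-mode submission that returns immediately. A minimal driver-script sketch that chains jobs on exit codes (the class names and jar path are placeholders):

    import scala.sys.process._

    // each spark-submit runs to completion (client mode) before the next starts,
    // and a non-zero exit code stops the chain
    val jobs = Seq("com.example.JobA", "com.example.JobB")
    jobs.foreach { cls =>
      val exit = Seq("spark-submit", "--class", cls, "/path/to/app.jar").!
      require(exit == 0, s"$cls failed with exit code $exit")
    }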

unknown issue in submitting a spark job

2015-01-29 Thread ey-chih chow
Hi, I submitted a job using spark-submit and got the following exception. Does anybody know how to fix this? Thanks. Ey-Chih Chow 15/01/29 08:53:10 INFO storage.BlockManagerMasterActor: Registering block manager ip-10-10-8-191.us-west-2...

RE: unknown issue in submitting a spark job

2015-01-29 Thread ey-chih chow
The worker node has 15 GB of memory, one 32 GB SSD, and 2 cores. The data file comes from S3. If I don't set mapred.max.split.size, the job is fine with only one partition; otherwise it generates an OOME. Ey-Chih Chow From: moham...@glassbeam.com To: eyc...@hotmail.com; user@spark.apache.org Subject: RE...
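
The knob under discussion, as a one-line sketch (64 MB is a placeholder; smaller splits mean more tasks, each holding less data at once, but every resulting partition must still fit in executor memory):

    // Hadoop 1.x-era key, as used in this thread; Hadoop 2 renames it to
    // mapreduce.input.fileinputformat.split.maxsize
    sc.hadoopConfiguration.setLong("mapred.max.split.size", 64L * 1024 * 1024) // 64 MB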

RE: unknown issue in submitting a spark job

2015-01-29 Thread ey-chih chow
I use the default value, which I think is 512 MB. If I change it to 1024 MB, spark-submit fails because there is not enough memory for the RDD. Ey-Chih Chow From: moham...@glassbeam.com To: eyc...@hotmail.com; user@spark.apache.org Subject: RE: unknown issue in submitting a spark job Date: Fri, 30 Jan 2015 00...

RE: spark 1.2 ec2 launch script hang

2015-01-28 Thread ey-chih chow
... wrote: Try using an absolute path to the pem file. On Jan 26, 2015, at 8:57 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I used the spark-ec2 script of Spark 1.2 to launch a cluster. I have modified the script according to https://github.com/grzegorz-dubicki/spark/commit...

spark 1.2 ec2 launch script hang

2015-01-26 Thread ey-chih chow
... state. Is there anything else I should do to make it succeed? Thanks. Ey-Chih Chow

RE: spark 1.1.0 save data to hdfs failed

2015-01-24 Thread ey-chih chow
... for. For example, I saw a reference to CDH 5.2, which is Hadoop 2.5, but then you're showing that you are running an old Hadoop 1.x HDFS? There seem to be a number of possible incompatibilities here. On Fri, Jan 23, 2015 at 11:38 PM, ey-chih chow eyc...@hotmail.com wrote: Sorry, I still did not quite get your ...

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
... 1.1.0 save data to hdfs failed From: so...@cloudera.com To: eyc...@hotmail.com Are you receiving my replies? I have suggested a resolution. Look at the dependency tree next. On Jan 23, 2015 2:43 PM, ey-chih chow eyc...@hotmail.com wrote: I looked into the source code ...

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
... the dependencies coming in. There may be many sources of a Hadoop dep. On Fri, Jan 23, 2015 at 1:05 AM, ey-chih chow eyc...@hotmail.com wrote: Thanks. But after I replaced the Maven dependency from <dependency><groupId>org.apache.hadoop</groupId> ...

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
...? Date: Fri, 23 Jan 2015 17:01:48 + Subject: RE: spark 1.1.0 save data to hdfs failed From: so...@cloudera.com To: eyc...@hotmail.com Are you receiving my replies? I have suggested a resolution. Look at the dependency tree next. On Jan 23, 2015 2:43 PM, ey-chih chow eyc...@hotmail.com wrote ...

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
... the warning message still shows up in the namenode log. Is there anything else I need to do? Thanks. Ey-Chih Chow From: so...@cloudera.com Date: Thu, 22 Jan 2015 22:34:22 + Subject: Re: spark 1.1.0 save data to hdfs failed To: eyc...@hotmail.com CC: yuzhih...@gmail.com; user...

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
... symptoms of mixing incompatible versions of libraries. I'm not suggesting you haven't excluded Spark / Hadoop, but this is not the only way Hadoop deps get into your app. See my suggestion about investigating the dependency tree. On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow eyc...@hotmail.com ...
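
The advice in this thread, condensed into a hedged sbt sketch (the thread itself uses Maven; the coordinates and versions here are illustrative): depend on Spark as provided, pin a single hadoop-client matching the cluster's HDFS, then inspect the tree for stray Hadoop artifacts.

    // build.sbt sketch
    libraryDependencies ++= Seq(
      ("org.apache.spark" %% "spark-core" % "1.1.0" % "provided")
        .exclude("org.apache.hadoop", "hadoop-client"),
      "org.apache.hadoop" % "hadoop-client" % "1.0.4" // must match the cluster's HDFS
    )
    // then inspect with: sbt dependencyTree
    // (built into sbt 1.4+; earlier sbt needs the sbt-dependency-graph plugin)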

RE: spark 1.1.0 save data to hdfs failed

2015-01-22 Thread ey-chih chow
...@gmail.com CC: user@spark.apache.org Subject: RE: spark 1.1.0 save data to hdfs failed Date: Wed, 21 Jan 2015 23:12:56 -0800 The HDFS release should be Hadoop 1.0.4. Ey-Chih Chow Date: Wed, 21 Jan 2015 16:56:25 -0800 Subject: Re: spark 1.1.0 save data to hdfs failed From: yuzhih...@gmail.com ...

RE: spark 1.1.0 save data to hdfs failed

2015-01-22 Thread ey-chih chow
... </exclusion></exclusions></dependency>, but the warning message still shows up in the namenode log. Is there anything else I need to do? Thanks. Ey-Chih Chow From: so...@cloudera.com Date: Thu, 22 Jan 2015 22:34:22 + Subject: Re...

RE: Spark 1.1.0 - spark-submit failed

2015-01-21 Thread ey-chih chow
... workerCount) was added in Netty 3.5.4. Cheers. On Tue, Jan 20, 2015 at 4:15 PM, ey-chih chow eyc...@hotmail.com wrote: Hi, I issued the following command in an EC2 cluster launched using spark-ec2: ~/spark/bin/spark-submit --class com.crowdstar.cluster.etl.ParseAndClean --master spark://ec2-54-185...

spark 1.1.0 save data to hdfs failed

2015-01-21 Thread ey-chih chow
... ]], classOf[NullWritable], classOf[AvroKeyOutputFormat[GenericRecord]], job.getConfiguration) But it failed with the following error messages. Can anybody help? Thanks. Ey-Chih Chow
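
A hedged reconstruction of the call being attempted (a sketch: the RDD, schema setup, and output path are placeholders, and Job.getInstance is the Hadoop 2 spelling; on the Hadoop 1.x line discussed here it would be new Job(conf)):

    import org.apache.avro.generic.GenericRecord
    import org.apache.avro.mapred.AvroKey
    import org.apache.avro.mapreduce.AvroKeyOutputFormat
    import org.apache.hadoop.io.NullWritable
    import org.apache.hadoop.mapreduce.Job
    import org.apache.spark.rdd.RDD

    val job = Job.getInstance(sc.hadoopConfiguration)
    // AvroJob.setOutputKeySchema(job, schema) // schema setup elided
    val records: RDD[(AvroKey[GenericRecord], NullWritable)] = ??? // prepared upstream
    records.saveAsNewAPIHadoopFile(
      "hdfs:///path/to/output",                      // placeholder output path
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable],
      classOf[AvroKeyOutputFormat[GenericRecord]],
      job.getConfiguration)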

RE: spark 1.1.0 save data to hdfs failed

2015-01-21 Thread ey-chih chow
The HDFS release should be Hadoop 1.0.4. Ey-Chih Chow Date: Wed, 21 Jan 2015 16:56:25 -0800 Subject: Re: spark 1.1.0 save data to hdfs failed From: yuzhih...@gmail.com To: eyc...@hotmail.com CC: user@spark.apache.org What HDFS release are you using? Can you check the namenode log around the time ...

Spark 1.1.0 - spark-submit failed

2015-01-20 Thread ey-chih chow
... on how to fix the problem? Thanks. Ey-Chih Chow ==
Launch Command: /usr/lib/jvm/java-1.7.0/bin/java -cp /root/spark/work/driver-20150120200843-/spark-etl-0.0.1-SNAPSHOT.jar:/root/ephemeral-hdfs/conf:/root/spark/conf:/root/spark/lib/spark-assembly-1.1.0...

serialization issue with mapPartitions

2014-12-25 Thread ey-chih chow
... the following message? Cause: java.io.NotSerializableException: org.apache.hadoop.mapreduce.Job. If I take out 'val config = job.getConfiguration()' in the mapPartitions, the code works fine, even though job.getConfiguration() also shows up in newAPIHadoopFile(). Ey-Chih Chow

Re: serialization issue with mapPartitions

2014-12-25 Thread ey-chih chow
I should rephrase my question as follows: how do I use the Hadoop Configuration of a HadoopRDD inside a function passed as an input parameter to mapPartitions? Thanks. Ey-Chih Chow
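
One common workaround, as a minimal sketch (an assumption; the thread does not show the accepted fix): Hadoop's Job is not serializable, so ship the Configuration instead, wrapped in Spark 1.x's SerializableWritable. Here rdd and process are placeholders.

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.SerializableWritable

    val confBC = sc.broadcast(new SerializableWritable(sc.hadoopConfiguration))
    rdd.mapPartitions { iter =>
      val conf: Configuration = confBC.value.value // unwrapped once per partition
      iter.map(record => process(record, conf))    // process: hypothetical per-record fn
    }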

Debugging a Spark application using Eclipse throws SecurityException

2014-12-23 Thread ey-chih chow
... Best regards, Ey-Chih Chow

Re: Debugging a Spark application using Eclipse throws SecurityException

2014-12-23 Thread ey-chih chow
It's working now. I probably didn't specify the exclusion list correctly; I kept revising it and now it works. Thanks. Ey-Chih Chow
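
For readers hitting the same thing: this SecurityException is typically thrown when a signed jar ends up on the classpath with mismatched signature files. A hedged sketch of the equivalent build-time fix, assuming the fat jar is built with sbt-assembly (the thread itself used an Eclipse exclusion list):

    // build.sbt sketch: drop jar-signing files so the merged jar is unsigned
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*)
          if xs.lastOption.exists(f => f.endsWith(".SF") || f.endsWith(".DSA") || f.endsWith(".RSA")) =>
        MergeStrategy.discard
      case x =>
        val oldStrategy = (assemblyMergeStrategy in assembly).value
        oldStrategy(x)
    }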