Re: Enabling mapreduce.input.fileinputformat.list-status.num-threads in Spark?

2016-01-12 Thread Cheolsoo Park
eads"? Thanks. On Thu, Jul 23, 2015 at 8:50 PM, Cheolsoo Park <piaozhe...@gmail.com> wrote: Hi, I am wondering if anyone has successfully enabled "mapreduce.input.fileinputformat.list-status.num-threads" in Spar

Enabling mapreduce.input.fileinputformat.list-status.num-threads in Spark?

2015-07-23 Thread Cheolsoo Park
Hi, I am wondering if anyone has successfully enabled mapreduce.input.fileinputformat.list-status.num-threads in Spark jobs. I usually set this property to 25 to speed up file listing in MR jobs (Hive and Pig). But for some reason, this property does not take effect in Spark HadoopRDD resulting
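
For context, one common way to hand a Hadoop property to a Spark job is the "spark.hadoop." prefix: Spark copies any setting named spark.hadoop.<key> into the Hadoop Configuration it builds. The sketch below only assembles the matching spark-submit arguments. The helper name hadoop_conf_args and the script name my_job.py are hypothetical, and whether HadoopRDD actually honors this particular property is exactly the open question in this thread.

```python
# Sketch: build spark-submit arguments that forward Hadoop properties to Spark.
# Spark copies any "spark.hadoop.<key>" setting into the Hadoop Configuration
# it creates. hadoop_conf_args is a hypothetical helper, not a Spark API.

SPARK_HADOOP_PREFIX = "spark.hadoop."


def hadoop_conf_args(props):
    """Return ["--conf", "spark.hadoop.<key>=<value>", ...] for each property."""
    args = []
    # Sort keys so the generated command line is deterministic.
    for key, value in sorted(props.items()):
        args += ["--conf", f"{SPARK_HADOOP_PREFIX}{key}={value}"]
    return args


if __name__ == "__main__":
    # 25 is the value the original post reports using for MR jobs.
    props = {"mapreduce.input.fileinputformat.list-status.num-threads": 25}
    # my_job.py is a placeholder application script (assumption).
    print(" ".join(["spark-submit"] + hadoop_conf_args(props) + ["my_job.py"]))
```

Setting the same key on the SparkConf before the context is created should be equivalent; the command-line form is shown here only because it needs no Spark installation to inspect.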

Re: SparkSQL failing while writing into S3 for 'insert into table'

2015-05-23 Thread Cheolsoo Park
It seems the query results were first written to a tmp dir and then renamed into the final folder, and the rename step failed. This problem exists not only in SparkSQL but also in any Hadoop tool (e.g. Hive, Pig, etc.) when used with S3. Usually, it is better to write task

Re: dynamicAllocation spark-shell

2015-04-23 Thread Cheolsoo Park
Hi, "Attempted to request a negative number of executor(s) -663 from the cluster manager. Please specify a positive number!" This is a bug in dynamic allocation. Here is the JIRA: https://issues.apache.org/jira/browse/SPARK-6954 Thanks! Cheolsoo On Thu, Apr 23, 2015 at 7:57 AM, Michael Stone