[no subject]

2016-10-06 Thread ayan guha
Hi, I faced one issue: - writing a Hive partitioned table using df.withColumn("partition_date",to_date(df["INTERVAL_DATE"])).write.partitionBy('partition_date').saveAsTable("sometable",mode="overwrite") - the data got written to HDFS fine, and I can see the folders with partition names such as

Re: spark standalone with multiple workers gives a warning

2016-10-06 Thread Ofer Eliassaf
The slaves should connect to the master using the scripts in sbin... You can read about it here: http://spark.apache.org/docs/latest/spark-standalone.html On Thu, Oct 6, 2016 at 6:46 PM, Mendelson, Assaf wrote: > Hi, > > I have a spark standalone cluster. On it, I am

spark stateful streaming error

2016-10-06 Thread backtrack5
I am using a PySpark stateful stream (2.0) that receives JSON from a socket. I get the following error when I send more than one record: if I send only one message I get a response, but if I send more than one message I get the error below. def createmd5Hash(po): data =
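The post's createmd5Hash helper is truncated, so its body below is assumed (hashing a canonical JSON encoding of the record); the second function shows the general shape of an updateStateByKey update function, which is where PySpark stateful streaming keeps per-key state across batches.

```python
import hashlib
import json

def createmd5Hash(po):
    # Hypothetical body: the original helper is cut off in the post.
    # Hash a canonical JSON encoding so equal records hash identically.
    data = json.dumps(po, sort_keys=True).encode("utf-8")
    return hashlib.md5(data).hexdigest()

def update_state(new_values, last_state):
    # Shape of an updateStateByKey function: combine this batch's new
    # values with the previous state (None the first time a key is seen).
    return (last_state or 0) + sum(new_values)
```

Both are pure Python, so they can be unit-tested outside Spark before being handed to a DStream; serialization problems in such helpers are a common source of errors that only appear once real data flows.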

Re: spark 2.0.1 upgrade breaks on WAREHOUSE_PATH

2016-10-06 Thread Koert Kuipers
If the intention is to create this on the default Hadoop filesystem (and not locally), then maybe we can use FileSystem.getHomeDirectory()? It should return the correct home directory on the relevant FileSystem (local or HDFS). If the intention is to create this only locally, then why bother using

Spark REST API YARN client mode is not full?

2016-10-06 Thread Vladimir Tretyakov
Hi, when I start Spark v1.6 (cdh5.8.0) in YARN client mode I see that port 4040 is available, but the UI shows nothing and the API does not return full information. I started the Spark application like this: spark-submit --master yarn-client --class org.apache.spark.examples.SparkPi
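The REST API in question lives under /api/v1 on the driver's UI port. A sketch of consuming its applications listing; the sample payload is fabricated for illustration (the field names follow the Spark monitoring docs, the values are made up), since a live driver is not assumed here.

```python
import json

# Fabricated sample of what GET http://driver:4040/api/v1/applications
# returns: a JSON array of applications with id, name, and attempts
sample = json.loads("""
[{"id": "application_1475700000000_0001",
  "name": "Spark Pi",
  "attempts": [{"sparkUser": "vladimir", "completed": false}]}]
""")

def app_names(apps):
    # Pull the application names out of the REST payload
    return [a["name"] for a in apps]

print(app_names(sample))
```

In client mode the endpoint is served by the driver process itself, so an empty or partial response usually means the driver's UI, not YARN, is the place to investigate.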

Re: DataFrame Sort gives Cannot allocate a page with more than 17179869176 bytes

2016-10-06 Thread amarouni
You can get some more insights by using the Spark history server (http://spark.apache.org/docs/latest/monitoring.html); it can show you which task is failing and other information that might help you debug the issue. On 05/10/2016 19:00, Babak Alipour wrote: > The issue seems to lie in

Re: building Spark 2.1 vs Java 1.8 on Ubuntu 16/06

2016-10-06 Thread Marco Mistroni
Thanks Fred. The build/mvn script triggers compilation using zinc, and I want to avoid that, as every time I have tried it, it runs into errors while compiling Spark core. How can I disable zinc by default? Kr On 5 Oct 2016 10:53 pm, "Fred Reiss" wrote: > Actually the memory options

Re: pyspark: sqlContext.read.text() does not work with a list of paths

2016-10-06 Thread Hyukjin Kwon
It is obviously a bug; it was introduced by my PR, https://github.com/apache/spark/commit/d37c7f7f042f7943b5b684e53cf4284c601fb347 +1 for creating a JIRA and PR. If you have any problem with this, I would like to do it quickly. On 5 Oct 2016 9:12 p.m., "Laurent Legrand"