Re: Classpath problem trying to use DataFrames

2015-12-12 Thread Ricky
I encountered similar problems using HiveContext. When the code printed the classloader, it had changed from the AppClassLoader to a MultiClassLoader. -- From: Harsh J; Time: 2015-12-12 12:09; To: Christopher Brady

Concatenate a string to a Column of type string in DataFrame

2015-12-12 Thread satish chandra j
Hi, I am trying to update a column value in a DataFrame. For incrementing a column of integer data type, the below code works: val new_df = old_df.select(df("Int_Column") + 10). If I implement the similar approach for appending a string to a column of string datatype as below, then it does not error out

Re: Re: Spark assembly in Maven repo?

2015-12-12 Thread Sean Owen
That's exactly what the various artifacts in the Maven repo are for. The API classes for core are in the core artifact and so on. You don't need an assembly. On Sat, Dec 12, 2015 at 12:32 AM, Xiaoyong Zhu wrote: > Yes, so our scenario is to treat the spark assembly as an
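A minimal sketch of what that looks like in an sbt build, depending on the published artifacts directly instead of an assembly jar (the 1.5.2 version and the provided scope are assumptions, not from the thread):

```scala
// Pull the Spark API from the individual Maven artifacts; "provided"
// keeps them out of the application jar since the cluster supplies them.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.5.2" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.5.2" % "provided"
)
```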

Re: spark data frame write.mode("append") bug

2015-12-12 Thread sri hari kali charan Tummala
Hi All, https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L48 In the present Spark version there is a bug at line 48: checking whether a table exists in a database using LIMIT does not work for all databases; SQL Server
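For reference, the check being discussed is roughly the following (a paraphrase of the branch-1.5 code at the linked line, not an exact quote):

```scala
import java.sql.Connection
import scala.util.Try

// Probe the table with a LIMIT query; LIMIT is not standard SQL, so on
// databases such as SQL Server (which expects SELECT TOP 1 ...) the probe
// always fails and an existing table is wrongly reported as absent.
def tableExists(conn: Connection, table: String): Boolean =
  Try(conn.prepareStatement(s"SELECT 1 FROM $table LIMIT 1")
    .executeQuery().next()).isSuccess
```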

Re: spark data frame write.mode("append") bug

2015-12-12 Thread kali.tumm...@gmail.com
Hi All, https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L48 In the present Spark version there is a bug at line 48: checking whether a table exists in a database using LIMIT does not work for all databases; SQL

Re: Release data for spark 1.6?

2015-12-12 Thread Ted Yu
Please take a look at SPARK-9078, which allows JDBC dialects to override the query for checking table existence. > On Dec 12, 2015, at 7:12 PM, sri hari kali charan Tummala > wrote: > > Hi Michael, Ted, > >
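A hedged sketch of what that hook enables, assuming the getTableExistsQuery override that SPARK-9078 adds (available from Spark 1.6); the dialect below is illustrative, not the one shipped with Spark:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Illustrative dialect that swaps the non-standard LIMIT probe for a
// query SQL Server accepts; it is picked up for matching JDBC URLs.
object SqlServerExistsDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:sqlserver")

  override def getTableExistsQuery(table: String): String =
    s"SELECT TOP 1 1 FROM $table" // T-SQL has no LIMIT clause
}

JdbcDialects.registerDialect(SqlServerExistsDialect)
```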

Re: Release data for spark 1.6?

2015-12-12 Thread sri hari kali charan Tummala
Thanks Sean and Ted, I will wait for 1.6 to be out. Happy Christmas to all! Thanks Sri On Sat, Dec 12, 2015 at 12:18 PM, Ted Yu wrote: > Please take a look at SPARK-9078 which allows jdbc dialects to override > the query for checking table existence. > > On Dec 12, 2015,

Spark does not clean garbage in blockmgr folders on slaves if long running spark-shell is used

2015-12-12 Thread Alexander Pivovarov
Recently I faced an issue with Spark 1.5.2 standalone: Spark does not clean garbage in blockmgr folders on slaves until I exit from spark-shell. I opened spark-shell and ran my Spark program for several input folders. Then I noticed that Spark uses several GBs of disk space on all slaves in
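A hedged workaround sketch for such long-running spark-shell sessions: Spark's ContextCleaner removes shuffle and block files only after the driver garbage-collects the objects referencing them, so dropping references and nudging the JVM GC may reclaim blockmgr space without exiting the shell (the forced-GC trick is a heuristic assumption, not a documented contract):

```scala
// Run a job, then release everything that pins its blocks on the slaves.
var data = sc.textFile("hdfs:///some/input").map(_.length).cache()
data.count()     // materialize the cached blocks
data.unpersist() // drop the cached partitions explicitly
data = null      // remove the driver-side reference to the RDD
System.gc()      // hint a collection so ContextCleaner can delete the files
```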

Re: Release data for spark 1.6?

2015-12-12 Thread sri hari kali charan Tummala
Hi Michael, Ted, https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L48 In the present Spark version there is a bug at line 48: checking whether a table exists in a database using LIMIT does not work for all databases

Re: Concatenate a string to a Column of type string in DataFrame

2015-12-12 Thread Yanbo Liang
Hi Satish, You can refer to the following code snippet: df.select(concat(col("String_Column"), lit("00:00:000"))) Yanbo 2015-12-12 16:01 GMT+08:00 satish chandra j : > HI, > I am trying to update a column value in DataFrame, incrementing a column > of integer data type
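Expanded into a self-contained sketch (imports added; concat is available from Spark 1.5, and old_df plus the column names follow the thread):

```scala
import org.apache.spark.sql.functions.{col, concat, lit}

// concat takes Columns, so the constant suffix must be wrapped in lit(...).
val new_df = old_df.select(
  concat(col("String_Column"), lit("00:00:000")).as("String_Column")
)
```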

RE: Concatenate a string to a Column of type string in DataFrame

2015-12-12 Thread Satish
Hi, Will the below mentioned snippet work for Spark 1.4.0? Thanks for your inputs Regards, Satish -Original Message- From: "Yanbo Liang" Sent: 12-12-2015 20:54 To: "satish chandra j" Cc: "user" Subject: Re:
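Worth noting: functions.concat was only added in Spark 1.5, so the snippet above would not compile on 1.4.0. A hedged alternative sketch for 1.4 uses a UDF (the helper name is hypothetical; the column and suffix come from the thread):

```scala
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical 1.4-compatible helper: append a constant suffix via a UDF,
// since org.apache.spark.sql.functions.concat does not exist before 1.5.
val appendSuffix = udf((s: String) => s + "00:00:000")
val new_df = old_df.select(appendSuffix(col("String_Column")))
```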

Has the format of a Spark jar file changed in 1.5

2015-12-12 Thread Steve Lewis
I have been using my own code to build the jar file I use for spark-submit. In 1.4 I could simply add all class and resource files found on the classpath to the jar, and add all jars on the classpath into a directory called lib in the jar file. In 1.5 I see that resources and classes in jars in

How to use HProf to profile Spark CPU overhead

2015-12-12 Thread Jia Zou
My goal is to use hprof to profile where the bottleneck is. Is there any way to do this without modifying and rebuilding the Spark source code? I've tried to add "-Xrunhprof:cpu=samples,depth=100,interval=20,lineno=y,thread=y,file=/home/ubuntu/out.hprof" to the spark-class script, but it can only profile

Re: spark data frame write.mode("append") bug

2015-12-12 Thread Michael Armbrust
If you want to contribute to the project, open a JIRA/PR: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark On Sat, Dec 12, 2015 at 3:13 AM, kali.tumm...@gmail.com < kali.tumm...@gmail.com> wrote: > Hi All, > > >

Re: How to use HProf to profile Spark CPU overhead

2015-12-12 Thread Ted Yu
Have you tried adding the option below through spark.executor.extraJavaOptions? Cheers > On Dec 13, 2015, at 3:36 AM, Jia Zou wrote: > > My goal is to use hprof to profile where the bottleneck is. > Is there anyway to do this without modifying and rebuilding Spark
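Spelled out as a sketch, using the agent string from the question (setting it programmatically via SparkConf is one of several equivalent routes; --conf on spark-submit or spark-defaults.conf work too):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Pass the hprof agent flags to every executor JVM; each executor then
// writes its own profile to the given path on its local filesystem.
val conf = new SparkConf()
  .setAppName("hprof-profiling")
  .set("spark.executor.extraJavaOptions",
    "-Xrunhprof:cpu=samples,depth=100,interval=20,lineno=y,thread=y," +
      "file=/home/ubuntu/out.hprof")
val sc = new SparkContext(conf)
```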

Re: How to use HProf to profile Spark CPU overhead

2015-12-12 Thread Jia Zou
Hi, Ted, it works, thanks a lot for your help! --Jia On Sat, Dec 12, 2015 at 3:01 PM, Ted Yu wrote: > Have you tried adding the option below through > spark.executor.extraJavaOptions ? > > Cheers > > > On Dec 13, 2015, at 3:36 AM, Jia Zou wrote: >