Re: Spark CSV Quote only NOT NULL

2019-07-11 Thread Swetha Ramaiah
Hi Anil, that was an example. You can replace the quote character with double quotes. But these options should give you an idea of how you want to treat nulls, empty values and quotes. When I faced this issue, I forked the Spark repo and looked at the test suite. This definitely helped me solve my issue.

Re: Problems running TPC-H on Raspberry Pi Cluster

2019-07-11 Thread Reynold Xin
I don't think Spark is meant to run with 1GB of memory on the entire system. The JVM loads almost 200MB of bytecode, and each page during query processing takes a minimum of 64MB. Maybe on the 4GB model of the Raspberry Pi 4. On Wed, Jul 10, 2019 at 7:57 AM, agg212 < alexander_galaka...@brown.edu >

Re: Spark Write method not ignoring double quotes in the csv file

2019-07-11 Thread Aayush Ranaut
Question 2: You might be creating a dataframe while reading a parquet file. df = spark.read.load("file.parquet") df.select(rtrim("columnName")) Regards Prathmesh Ranaut https://linkedin.com/in/prathmeshranaut > On Jul 12, 2019, at 9:15 AM, anbutech wrote: > > Hello All, Could you please

Spark Write method not ignoring double quotes in the csv file

2019-07-11 Thread anbutech
Hello All, Could you please help me to fix the below questions. Question 1: I have tried the below options while writing the final data in a csv file to ignore double quotes in the same csv file. Nothing worked. I'm using Spark version 2.2 and Scala version 2.11. option("quote", "\"")

Re: Spark CSV Quote only NOT NULL

2019-07-11 Thread Anil Kulkarni
Hi Swetha, Thank you. But we need the data to be quoted with ", and when a field is null, we don't need the quotes around it. Example: "A",,"B","C" Thanks Anil On Thu, Jul 11, 2019, 1:51 PM Swetha Ramaiah wrote: > If you are using Spark 2.4.0, I think you can try something like this: > >

Re: [Beginner] Run compute on large matrices and return the result in seconds?

2019-07-11 Thread Steven Stetzler
Hi Gautham, I am a beginner Spark user too and I may not have a complete understanding of your question, but I thought I would start a discussion anyway. Have you looked into using Spark's built-in Correlation function? ( https://spark.apache.org/docs/latest/ml-statistics.html) This might let you

Re: Spark CSV Quote only NOT NULL

2019-07-11 Thread Swetha Ramaiah
If you are using Spark 2.4.0, I think you can try something like this: .option("quote", "\u") .option("emptyValue", "") .option("nullValue", null) Regards Swetha > On Jul 11, 2019, at 1:45 PM, Anil Kulkarni wrote: > > Hi Spark users, > > My question is : > I am writing a Dataframe to

Spark CSV Quote only NOT NULL

2019-07-11 Thread Anil Kulkarni
Hi Spark users, My question is: I am writing a Dataframe to csv. The option I am using is .option("quoteAll","true"). This is quoting even null values and making them appear as an empty string. How do I make sure that quotes are enabled only for non-null values? -- Cheers, Anil Kulkarni

Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-11 Thread Jacek Laskowski
Hi, Thanks Dongjoon Hyun for stepping up as a release manager! Much appreciated. If there's a volunteer to cut a release, I'm always happy to support it. In addition, the more frequent the releases, the better for end users, so they have a choice to upgrade and have all the latest fixes, or wait. It's their

Re: Spark Newbie question

2019-07-11 Thread infa elance
Thanks Jerry for the clarification. Ajay. On Thu, Jul 11, 2019 at 12:48 PM Jerry Vinokurov wrote: > Hi Ajay, > > When a Spark SQL statement references a table, that table has to be > "registered" first. Usually the way this is done is by reading in a > DataFrame, then calling the

unsubscribe

2019-07-11 Thread Bill Bejeck
unsubscribe

Re: Spark Newbie question

2019-07-11 Thread Jerry Vinokurov
Hi Ajay, When a Spark SQL statement references a table, that table has to be "registered" first. Usually the way this is done is by reading in a DataFrame, then calling createOrReplaceTempView (or one of a few other functions) on that data frame, with the argument being the name under which

Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-11 Thread Dongjoon Hyun
Additionally, one more correctness patch landed yesterday. - SPARK-28015 Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats Bests, Dongjoon. On Tue, Jul 9, 2019 at 10:11 AM Dongjoon Hyun wrote: > Thank you for the reply, Sean. Sure. 2.4.x should be a LTS

Re: Spark Newbie question

2019-07-11 Thread infa elance
Sorry, I guess I hit the send button too soon. This question is regarding a Spark stand-alone cluster. My understanding is Spark is an execution engine and not a storage layer. Spark processes data in memory, but when someone refers to a Spark table created through Spark SQL (df/rdd), what exactly

Spark Newbie question

2019-07-11 Thread infa elance
This is a stand-alone Spark cluster. My understanding is Spark is an execution engine and not a storage layer. Spark processes data in memory, but when someone refers to a Spark table created through Spark SQL (df/rdd), what exactly are they referring to? Could it be a Hive table? If yes, is it the

RE: [Beginner] Run compute on large matrices and return the result in seconds?

2019-07-11 Thread Gautham Acharya
Ping? I would really appreciate advice on this! Thank you! From: Gautham Acharya Sent: Tuesday, July 9, 2019 4:22 PM To: user@spark.apache.org Subject: [Beginner] Run compute on large matrices and return the result in seconds? This is my first email to this mailing list, so I apologize if I

Re: Help: What's the biggest length of SQL that's supported in SparkSQL?

2019-07-11 Thread Reynold Xin
There is no explicit limit but a JVM string cannot be bigger than 2G. It will also at some point run out of memory with too big of a query plan tree or become incredibly slow due to query planning complexity. I've seen queries that are tens of MBs in size. On Thu, Jul 11, 2019 at 5:01 AM, 李书明

How to pass Datasets as arguments to user defined function of a class

2019-07-11 Thread Shyam P
Hi, Any help is appreciated. https://stackoverflow.com/questions/56991447/in-spark-dataset-s-can-be-passed-as-input-args-to-a-function-to-get-out-put-args Regards, Shyam