Accumulators displayed in SparkUI in 1.4.1?

2016-05-25 Thread Daniel Barclay

Was the feature of displaying accumulators in the Spark UI implemented in Spark 
1.4.1, or was that added later?

Thanks,
Daniel




-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



How to convert from DataFrame to Dataset[Row]?

2016-07-15 Thread Daniel Barclay

In Spark 1.6.1, how can I convert a DataFrame to a Dataset[Row]?

Is there a direct conversion? (Trying .as[Row] doesn't work,
even after importing .implicits._.)

Is there some way to map the Rows from the DataFrame into the Dataset[Row]?
(DataFrame.map would just make another DataFrame, right?)


Thanks,
Daniel




Re: Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-18 Thread Daniel Barclay

Koert,

Koert Kuipers wrote:


A single json object would mean that, for most parsers, it needs to fit in memory
when reading or writing


Note that codlife didn't seem to be asking about /single-object/ JSON files,
but about /standard-format/ JSON files.


On Oct 15, 2016 11:09, "codlife" <1004910...@qq.com> wrote:

Hi:
   I'm in doubt about the design of spark.read.json: why must the json file not
be a standard json file? Who can tell me the internal reason? Any advice is
appreciated.
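For illustration, the format spark.read.json expects is line-delimited JSON (one complete object per line), where each line can be parsed on its own, while a standard pretty-printed JSON document is a single value that must be read whole. A minimal sketch of the difference using Python's stdlib json module (not Spark itself):

```python
import json

# Line-delimited ("JSON Lines") input: one complete object per line.
# Each line parses independently, so a file can be split across workers.
json_lines = '{"id": 1, "city": "north rocks"}\n{"id": 2, "city": "sydney"}'
records = [json.loads(line) for line in json_lines.splitlines()]
assert records[0]["id"] == 1 and records[1]["city"] == "sydney"

# A "standard" pretty-printed JSON array is one single value: the parser
# must consume the whole document before producing any record.
standard = '[\n  {"id": 1},\n  {"id": 2}\n]'
records2 = json.loads(standard)  # works, but only on the entire text at once
assert [r["id"] for r in records2] == [1, 2]
```

This is the memory/splittability trade-off Koert describes: a single large JSON value cannot be partitioned by newlines, whereas JSON Lines can.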



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Why-the-json-file-used-by-sparkSession-read-json-must-be-a-valid-json-object-per-line-tp27907.html
 

Sent from the Apache Spark User List mailing list archive at Nabble.com.







Re: CSV escaping not working

2016-10-28 Thread Daniel Barclay

In any case, it seems that the current behavior is not documented sufficiently.

Koert Kuipers wrote:

i can see how unquoted csv would work if you escape delimiters, but i have 
never seen that in practice.

On Thu, Oct 27, 2016 at 2:03 PM, Jain, Nishit <nja...@underarmour.com> wrote:

I’d think quoting is only necessary if you are not escaping delimiters in 
data. But we can only share our opinions. It would be good to see something 
documented.
Could this be the cause of the issue?:
https://issues.apache.org/jira/browse/CSV-135 


From: Koert Kuipers <ko...@tresata.com>
Date: Thursday, October 27, 2016 at 12:49 PM
To: "Jain, Nishit" <nja...@underarmour.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: CSV escaping not working

well my expectation would be that if you have delimiters in your data you
need to quote your values. if you then have quotes within your data you need to
escape them.

so escaping is only necessary if quoted.

On Thu, Oct 27, 2016 at 1:45 PM, Jain, Nishit <nja...@underarmour.com> wrote:

Do you mind sharing why should escaping not work without quotes?

From: Koert Kuipers <ko...@tresata.com>
Date: Thursday, October 27, 2016 at 12:40 PM
To: "Jain, Nishit" <nja...@underarmour.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: CSV escaping not working

that is what i would expect: escaping only works if quoted

On Thu, Oct 27, 2016 at 1:24 PM, Jain, Nishit <nja...@underarmour.com> wrote:

Interesting finding: Escaping works if data is quoted but not 
otherwise.

From: "Jain, Nishit" <nja...@underarmour.com>
Date: Thursday, October 27, 2016 at 10:54 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: CSV escaping not working

I am using spark-core version 2.0.1 with Scala 2.11. I have simple 
code to read a csv file which has \ escapes.

val myDA = spark.read
  .option("quote", null)
  .schema(mySchema)
  .csv(filePath)

As per the documentation, \ is the default escape for the csv reader, but it
does not work: Spark is reading \ as part of my data. For example, the City
column in the csv file is *north rocks\,au*. I am expecting the city column to
read in code as *northrocks,au*, but instead Spark reads it as *northrocks\*
and moves *au* to the next column.

I have tried following but did not work:

  * Explicitly defined escape: .option("escape", "\\")
  * Changed escape to | or : in file and in code
  * I have tried using spark-csv library

Anyone facing the same issue? Am I missing something?

Thanks






