Re: Spark Scheduler creating Straggler Node

2016-03-08 Thread Prabhu Joseph
I don't just want to replicate all cached blocks. I am trying to find a way to solve the issue I mentioned in the mail above. Having replicas for all cached blocks will add more cost to customers. On Wed, Mar 9, 2016 at 9:50 AM, Reynold Xin wrote: > You just want to be
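
For context, replicating every cached block is already possible with the replicated storage levels; a minimal sketch of what that looks like (the input path is made up), which is exactly the extra memory cost being objected to here:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    val sc = new SparkContext(new SparkConf().setAppName("replicated-cache-sketch"))

    // MEMORY_ONLY_2 keeps each cached partition on two executors, so a second
    // node can serve NODE_LOCAL tasks, at roughly double the cache footprint.
    val cached = sc.textFile("hdfs:///data/events").persist(StorageLevel.MEMORY_ONLY_2)
    cached.count()  // materialize the cache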

Re: Inconsistent file extensions and omitting file extensions written by CSV, TEXT and JSON data sources.

2016-03-08 Thread Reynold Xin
Isn't this just specified by the user? On Tue, Mar 8, 2016 at 9:49 PM, Hyukjin Kwon wrote: > Hi all, > > Currently, the output from CSV, TEXT and JSON data sources does not have > file extensions such as .csv, .txt and .json (except for compression > extensions such as

Inconsistent file extensions and omitting file extensions written by CSV, TEXT and JSON data sources.

2016-03-08 Thread Hyukjin Kwon
Hi all, Currently, the output from the CSV, TEXT and JSON data sources does not have file extensions such as .csv, .txt and .json (except for compression extensions such as .gz, .deflate and .bz2). In addition, it looks like Parquet has extensions such as .gz.parquet or .snappy.parquet according to
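
The behaviour described here can be reproduced with the built-in writers; a small sketch (output paths are made up, and sqlContext is assumed to be an existing SQLContext):

    // Output part files carry no .json / .txt suffix; only compression
    // suffixes such as .gz show up when a codec is configured.
    val df = sqlContext.range(10).toDF("id")
    df.write.json("/tmp/out-json")                                   // e.g. part-r-00000-<uuid>
    df.selectExpr("cast(id as string)").write.text("/tmp/out-text")  // likewise, no .txt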

Re: Spark Scheduler creating Straggler Node

2016-03-08 Thread Reynold Xin
You just want to be able to replicate hot cached blocks right? On Tuesday, March 8, 2016, Prabhu Joseph wrote: > Hi All, > > When a Spark Job is running, and one of the Spark Executor on Node A > has some partitions cached. Later for some other stage, Scheduler
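
If the straggling comes from the scheduler holding tasks back for NODE_LOCAL placement on the node with the cached partitions, the locality-wait settings are the usual knob to loosen that preference; a sketch (the value shown is illustrative, not a recommendation):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("locality-wait-sketch")
      // How long the scheduler waits for a data-local slot before falling back
      // to a less-local one; lowering it spreads work off the hot node sooner.
      .set("spark.locality.wait", "500ms")
    val sc = new SparkContext(conf)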

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-08 Thread Burak Yavuz
+1 On Tue, Mar 8, 2016 at 10:59 AM, Andrew Or wrote: > +1 > > 2016-03-08 10:59 GMT-08:00 Yin Huai : > >> +1 >> >> On Mon, Mar 7, 2016 at 12:39 PM, Reynold Xin wrote: >> >>> +1 (binding) >>> >>> >>> On Sun, Mar 6, 2016 at 12:08

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-08 Thread Andrew Or
+1 2016-03-08 10:59 GMT-08:00 Yin Huai : > +1 > > On Mon, Mar 7, 2016 at 12:39 PM, Reynold Xin wrote: > >> +1 (binding) >> >> >> On Sun, Mar 6, 2016 at 12:08 PM, Egor Pahomov >> wrote: >> >>> +1 >>> >>> Spark ODBC server is

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-08 Thread Yin Huai
+1 On Mon, Mar 7, 2016 at 12:39 PM, Reynold Xin wrote: > +1 (binding) > > > On Sun, Mar 6, 2016 at 12:08 PM, Egor Pahomov > wrote: > >> +1 >> >> Spark ODBC server is fine, SQL is fine. >> >> 2016-03-03 12:09 GMT-08:00 Yin Yang :

Re: Spark structured streaming

2016-03-08 Thread Michael Armbrust
This is in active development, so there is not much that can be done from an end user perspective. In particular the only sink that is available in apache/master is a testing sink that just stores the data in memory. We are working on a parquet based file sink and will eventually support all the
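
For orientation, this is roughly the shape the API settled on once structured streaming shipped (a sketch against the later, stable Spark 2.x API; at the time of this thread the entry points were still read.stream()/write.stream() and only the in-memory testing sink existed):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("structured-streaming-sketch").getOrCreate()

    // A toy socket source, read as an unbounded DataFrame of lines.
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // The in-memory testing sink mentioned above: results land in a queryable temp table.
    val query = lines.writeStream
      .format("memory")
      .queryName("lines_table")
      .start()

    spark.sql("select * from lines_table").show()
    query.stop()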

Re: Spark structured streaming

2016-03-08 Thread Jacek Laskowski
Hi Praveen, I don't really know. I think TD or Michael should know, as they are personally involved in the task (as far as I could figure out from the JIRA and the changes). Ping people on the JIRA so they notice your question(s). Pozdrawiam, Jacek Laskowski

Re: Use cases for kafka direct stream messageHandler

2016-03-08 Thread Cody Koeninger
No, it looks like you'd have to catch them in the serializer and have the serializer return an Option or something. The new consumer builds a buffer full of records, not one at a time. On Mar 8, 2016 4:43 AM, "Marius Soutier" wrote: > > > On 04.03.2016, at 22:39, Cody Koeninger
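
A minimal sketch of the "catch it in the deserializer" approach for the new consumer (class and names here are illustrative, not from the thread): a Deserializer that swallows malformed payloads and hands back None instead of throwing while the consumer fills its record buffer.

    import java.util
    import org.apache.kafka.common.serialization.{Deserializer, StringDeserializer}

    // Wraps the plain StringDeserializer; bad records become None instead of
    // an exception thrown inside the fetch.
    class SafeStringDeserializer extends Deserializer[Option[String]] {
      private val inner = new StringDeserializer

      override def configure(configs: util.Map[String, _], isKey: Boolean): Unit =
        inner.configure(configs, isKey)

      override def deserialize(topic: String, data: Array[Byte]): Option[String] =
        try Option(inner.deserialize(topic, data))
        catch { case _: Exception => None }

      override def close(): Unit = inner.close()
    }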

Re: Spark structured streaming

2016-03-08 Thread Praveen Devarao
Thanks Jacek for the pointer. Any idea which package can be used in .format()? The test cases seem to work using the DefaultSource class defined within DataFrameReaderWriterSuite [org.apache.spark.sql.streaming.test.DefaultSource]. Thanking You

Re: Spark structured streaming

2016-03-08 Thread Jacek Laskowski
Hi Praveen, I've spent a few hours on the changes related to streaming DataFrames (included in SPARK-8360) and concluded that it's currently only possible to read.stream(), but not write.stream(), since there are no streaming Sinks yet. Pozdrawiam, Jacek Laskowski

Spark structured streaming

2016-03-08 Thread Praveen Devarao
Hi, I would like to get my hands on the structured streaming feature coming out in Spark 2.0. I have tried looking around for code samples to get started but am not able to find any. The only things I could look into are the test cases that have been committed under the JIRA umbrella

Re: ML ALS API

2016-03-08 Thread Nick Pentreath
Hi Maciej Yes, that *train* method is intended to be public, but it is marked as *DeveloperApi*, which means that backward compatibility is not necessarily guaranteed, and that method may change. Having said that, even APIs marked as DeveloperApi do tend to be relatively stable. As the comment
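
For comparison, the supported, non-DeveloperApi entry point is the ALS Estimator; a sketch (ratingsDF is assumed to be a DataFrame with the columns named below):

    import org.apache.spark.ml.recommendation.ALS

    val als = new ALS()
      .setRank(10)
      .setMaxIter(10)
      .setRegParam(0.1)
      .setUserCol("userId")
      .setItemCol("movieId")
      .setRatingCol("rating")

    val model = als.fit(ratingsDF)            // trains an ALSModel
    val scored = model.transform(ratingsDF)   // adds a "prediction" column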

Re: BUILD FAILURE due to...Unable to find configuration file at location dev/scalastyle-config.xml

2016-03-08 Thread Dongjoon Hyun
Hi, I updated PR https://github.com/apache/spark/pull/11567. But, `lint-java` fails if that file is in the dev folder. (Jenkins fails, too.) So, inevitably, I changed pom.xml instead. Dongjoon. On Mon, Mar 7, 2016 at 11:40 PM, Jacek Laskowski wrote: > Hi, > > At first

Re: BUILD FAILURE due to...Unable to find configuration file at location dev/scalastyle-config.xml

2016-03-08 Thread Jacek Laskowski
Hi, At first glance it appears the commit *yesterday* (Warsaw time) broke the build :( https://github.com/apache/spark/commit/0eea12a3d956b54bbbd73d21b296868852a04494 Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark