Re: [VOTE][SPIP] SPARK-21866 Image support in Apache Spark

2017-09-21 Thread Denny Lee
+1 On Thu, Sep 21, 2017 at 11:15 Sean Owen wrote: > Am I right that this doesn't mean other packages would use this > representation, but that they could? > > The representation looked fine to me w.r.t. what DL frameworks need. > > My previous comment was that this is

Re: [discuss] Data Source V2 write path

2017-09-21 Thread Ryan Blue
> input data requirement Clustering and sorting within partitions are a good start. We can always add more later when they are needed. The primary use case I'm thinking of for this is partitioning and bucketing. If I'm implementing a partitioned table format, I need to tell Spark to cluster by

Re: [discuss] Data Source V2 write path

2017-09-21 Thread Reynold Xin
Ah yes I agree. I was just saying it should be options (rather than specific constructs). Having them at creation time makes a lot of sense. Although one tricky thing is what if they need to change, but we can probably just special case that. On Thu, Sep 21, 2017 at 6:28 PM Ryan Blue

Re: [discuss] Data Source V2 write path

2017-09-21 Thread Ryan Blue
I’d just pass them [partitioning/bucketing] as options, until there are clear (and strong) use cases to do them otherwise. I don’t think it makes sense to pass partitioning and bucketing information *into* this API. The writer should already know the table structure and should pass relevant

Re: New to dev community | Contribution to Mlib

2017-09-21 Thread Venali Sonone
Thank you for your response. The algorithm that I am proposing is Isolation Forest. Link to paper: paper. I particularly find that it should be included in Spark ML because so many applications that use Spark as part of real time

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-21 Thread Marcelo Vanzin
While you're at it, one thing that needs to be done is create a 2.1.3 version on JIRA. Not sure if you have enough permissions to do that. Fixes after an RC should use the new version, and if you create a new RC, you'll need to go and backdate the patches that went into the new RC. On Mon, Sep

Re: [VOTE][SPIP] SPARK-21866 Image support in Apache Spark

2017-09-21 Thread Sean Owen
Am I right that this doesn't mean other packages would use this representation, but that they could? The representation looked fine to me w.r.t. what DL frameworks need. My previous comment was that this is actually quite lightweight. It's kind of like how I/O support is provided for CSV and

[VOTE][SPIP] SPARK-21866 Image support in Apache Spark

2017-09-21 Thread Tim Hunter
Hello community, I would like to call for a vote on SPARK-21866. It is a short proposal that has important applications for image processing and deep learning. Joseph Bradley has offered to be the shepherd. JIRA ticket: https://issues.apache.org/jira/browse/SPARK-21866 PDF version:

Re: doc patch review

2017-09-21 Thread lucas.g...@gmail.com
https://issues.apache.org/jira/browse/SPARK-20448 On 21 September 2017 at 04:09, Hyukjin Kwon wrote: > I think it would have been nicer if the JIRA and PR are written in this > email. > > 2017-09-21 19:44 GMT+09:00 Steve Loughran : > >> I have a doc

Re: doc patch review

2017-09-21 Thread Hyukjin Kwon
I think it would have been nicer if the JIRA and PR were written in this email. 2017-09-21 19:44 GMT+09:00 Steve Loughran: > I have a doc patch on spark streaming & object store sources which has > been hitting its six-month-unreviewed state this week > > are there any

doc patch review

2017-09-21 Thread Steve Loughran
I have a doc patch on spark streaming & object store sources which has been hitting its six-month-unreviewed state this week. Are there any plans to review this, or shall I close it as a wontfix? Thanks