Re: [VOTE] [SPARK-25994] SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-30 Thread Mingjie Tang
+1, this is a very very important feature.

Mingjie

On Thu, Jan 31, 2019 at 12:42 AM Xiao Li  wrote:

> Change my vote from +1 to ++1
>
> On Wed, Jan 30, 2019 at 6:20 AM, Xiangrui Meng  wrote:
>
>> Correction: +0 vote doesn't mean "Don't really care". Thanks Ryan for the
>> offline reminder! Below is the Apache official interpretation
>> 
>> of fraction values:
>>
>> The in-between values are indicative of how strongly the voting
>> individual feels. Here are some examples of fractional votes and ways in
>> which they might be intended and interpreted:
>> +0: 'I don't feel strongly about it, but I'm okay with this.'
>> -0: 'I won't get in the way, but I'd rather we didn't do this.'
>> -0.5: 'I don't like this idea, but I can't find any rational
>> justification for my feelings.'
>> ++1: 'Wow! I like this! Let's do it!'
>> -0.9: 'I really don't like this, but I'm not going to stand in the way if
>> everyone else wants to go ahead with it.'
>> +0.9: 'This is a cool idea and I like it, but I don't have time/the
>> skills necessary to help out.'
>>
>>
>> On Wed, Jan 30, 2019 at 12:31 AM Martin Junghanns
>>  wrote:
>>
>>> Hi Dongjoon,
>>>
>>> Thanks for the hint! I updated the SPIP accordingly.
>>>
>>> I also changed the access permissions for the SPIP and design sketch
>>> docs so that anyone can comment.
>>>
>>> Best,
>>>
>>> Martin
>>> On 29.01.19 18:59, Dongjoon Hyun wrote:
>>>
>>> Hi, Xiangrui Meng.
>>>
>>> +1 for the proposal.
>>>
>>> However, please update the following section for this vote. As we see,
>>> it seems to be inaccurate because today is Jan. 29th. (Almost February).
>>> (Since I cannot comment on the SPIP, I replied here.)
>>>
>>> Q7. How long will it take?
>>>
>>>-
>>>
>>>If accepted by the community by the end of December 2018, we predict
>>>to be feature complete by mid-end March, allowing for QA during April 
>>> 2019,
>>>making the SPIP part of the next major Spark release (3.0, ETA May, 
>>> 2019).
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Tue, Jan 29, 2019 at 8:52 AM Xiao Li  wrote:
>>>
 +1

 On Tue, Jan 29, 2019 at 8:14 AM, Jules Damji  wrote:

> +1 (non-binding)
> (Heard their proposed tech-talk at Spark + A.I summit in London. Well
> attended & well received.)
>
> —
> Sent from my iPhone
> Pardon the dumb thumb typos :)
>
> On Jan 29, 2019, at 7:30 AM, Denny Lee  wrote:
>
> +1
>
> yay - let's do it!
>
> On Tue, Jan 29, 2019 at 6:28 AM Xiangrui Meng 
> wrote:
>
>> Hi all,
>>
>> I want to call for a vote of SPARK-25994
>> . It introduces a
>> new DataFrame-based component to Spark, which supports property graph
>> construction, Cypher queries, and graph algorithms. The proposal
>> 
>> was made available on user@
>> 
>> and dev@
>> 
>>  to
>> collect input. You can also find a sketch design doc attached to
>> SPARK-26028 .
>>
>> The vote will be up for the next 72 hours. Please reply with your
>> vote:
>>
>> +1: Yeah, let's go forward and implement the SPIP.
>> +0: Don't really care.
>> -1: I don't think this is a good idea because of the following
>> technical reasons.
>>
>> Best,
>> Xiangrui
>>
>


Re: [VOTE] Release Apache Spark 2.3.3 (RC1)

2019-01-30 Thread Jungtaek Lim
Please proceed without SPARK-26154, given that it is unlikely to be merged
within a week. The patch needs some more work, and we still haven't reached
consensus on the approach.

Btw, could one of the committers verify and adjust the priority and the
correctness label on SPARK-26154? I mentioned some committers earlier but
got no response.

On Tue, Jan 29, 2019 at 10:23 AM, Takeshi Yamamuro wrote:

> If there are no objections in the responses, I'll wait one more week
> while watching that PR's progress.
> Once that PR is merged, I'll start preparing the next vote.
>
>
>
> On Tue, Jan 29, 2019 at 4:57 AM Jungtaek Lim  wrote:
>
>> Regarding PR 23634, it is waiting on consensus about the approach for the
>> fix; it also needs some time to clean up the code and to address backward
>> compatibility. I'm postponing this work until we reach consensus on the
>> approach.
>>
>> So it may take some days or even some weeks to get PR 23634 merged (if
>> consensus is not reached in time).
>>
>> On Tue, Jan 29, 2019 at 2:18 AM, Sean Owen wrote:
>>
>>> More analysis at https://github.com/apache/spark/pull/23634
>>> It's not a regression, though it does relate to correctness, in a
>>> somewhat niche way.
>>> TD, Jose et al, is this a Blocker? and is the fix probably reliable
>>> enough to commit now?
>>>
>>> On Mon, Jan 28, 2019 at 10:59 AM Sandeep Katta
>>>  wrote:
>>> >
>>> > I feel this https://issues.apache.org/jira/browse/SPARK-26154 bug
>>> should be fixed in this release, as it relates to data correctness.
>>> >
>>> > On Mon, 28 Jan 2019 at 17:55, Takeshi Yamamuro 
>>> wrote:
>>> >>
>>> >> Hi, all
>>> >>
>>> >> I checked that the two issues below have been resolved and there is no
>>> blocker for branch-2.3 now, so I'll start preparing RC2 tomorrow.
>>> >> https://issues.apache.org/jira/browse/SPARK-26682
>>> >> https://issues.apache.org/jira/browse/SPARK-26709
>>> >>
>>> >> If there are any blockers or critical issues in branch-2.3, please
>>> let me know.
>>> >>
>>> >> Best,
>>> >> Takeshi
>>> >>
>>> >> On Thu, Jan 24, 2019 at 10:06 AM Takeshi Yamamuro <
>>> linguin@gmail.com> wrote:
>>> >>>
>>> >>> Thanks, all.
>>> >>>
>>> >>> I'll start a new vote as rc2 after the two issues above are resolved.
>>> >>>
>>> >>> Best,
>>> >>> Takeshi
>>> >>>
>>> >>>
>>> >>> On Thu, Jan 24, 2019 at 7:59 AM Xiao Li 
>>> wrote:
>>> 
>>>  -1
>>> 
>>>  https://issues.apache.org/jira/browse/SPARK-26709 is another
>>> blocker ticket that returns incorrect results.
>>> 
>>> 
>>>  On Wed, Jan 23, 2019 at 12:01 PM, Marcelo Vanzin  wrote:
>>> >
>>> > -1 too.
>>> >
>>> > I just upgraded https://issues.apache.org/jira/browse/SPARK-26682
>>> to
>>> > blocker. It's a small fix and we should make it in 2.3.3.
>>> >
>>> > On Thu, Jan 17, 2019 at 6:49 PM Takeshi Yamamuro <
>>> linguin@gmail.com> wrote:
>>> > >
>>> > > Please vote on releasing the following candidate as Apache Spark
>>> version 2.3.3.
>>> > >
>>> > > The vote is open until January 20, 8:00 PM (PST) and passes if a
>>> majority of +1 PMC votes are cast, with
>>> > > a minimum of 3 +1 votes.
>>> > >
>>> > > [ ] +1 Release this package as Apache Spark 2.3.3
>>> > > [ ] -1 Do not release this package because ...
>>> > >
>>> > > To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>> > >
>>> > > The tag to be voted on is v2.3.3-rc1 (commit
>>> b5ea9330e3072e99841270b10dc1d2248127064b):
>>> > > https://github.com/apache/spark/tree/v2.3.3-rc1
>>> > >
>>> > > The release files, including signatures, digests, etc. can be
>>> found at:
>>> > > https://dist.apache.org/repos/dist/dev/spark/v2.3.3-rc1-bin/
>>> > >
>>> > > Signatures used for Spark RCs can be found in this file:
>>> > > https://dist.apache.org/repos/dist/dev/spark/KEYS
>>> > >
>>> > > The staging repository for this release can be found at:
>>> > >
>>> https://repository.apache.org/content/repositories/orgapachespark-1297
>>> > >
>>> > > The documentation corresponding to this release can be found at:
>>> > > https://dist.apache.org/repos/dist/dev/spark/v2.3.3-rc1-docs/
>>> > >
>>> > > The list of bug fixes going into 2.3.3 can be found at the
>>> following URL:
>>> > > https://issues.apache.org/jira/projects/SPARK/versions/12343759
>>> > >
>>> > > FAQ
>>> > >
>>> > > =
>>> > > How can I help test this release?
>>> > > =
>>> > >
>>> > > If you are a Spark user, you can help us test this release by
>>> taking
>>> > > an existing Spark workload and running it on this release
>>> candidate, then
>>> > > reporting any regressions.
>>> > >
>>> > > If you're working in PySpark you can set up a virtual env and
>>> install
>>> > > the current RC and see if anything important breaks, in the
>>> Java/Scala
>>> > > you can add the stagin

Re: Welcome Jose Torres as a Spark committer

2019-01-30 Thread Bryan Cutler
Congrats Jose!

On Tue, Jan 29, 2019, 10:48 AM Shixiong Zhu  wrote:

> Hi all,
>
> The Apache Spark PMC recently added Jose Torres as a committer on the
> project. Jose has been a major contributor to Structured Streaming. Please
> join me in welcoming him!
>
> Best Regards,
>
> Shixiong Zhu
>
>


Re: Welcome Jose Torres as a Spark committer

2019-01-30 Thread Stavros Kontopoulos
Congrats Jose!

On Wed, Jan 30, 2019 at 10:44 AM Gabor Somogyi 
wrote:

> Congrats Jose!
>
> BR,
> G
>
> On Wed, Jan 30, 2019 at 9:05 AM Nuthan Reddy 
> wrote:
>
>> Congrats Jose,
>>
>> Regards,
>> Nuthan Reddy
>>
>>
>>
>> On Wed, Jan 30, 2019 at 1:22 PM Marco Gaido 
>> wrote:
>>
>>> Congrats, Jose!
>>>
>>> Bests,
>>> Marco
>>>
>>> On Wed, Jan 30, 2019 at 3:17 AM, JackyLee  wrote:
>>>
 Congrats, Joe!

 Best,
 Jacky



 --
 Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org




Re: Purpose of broadcast timeout

2019-01-30 Thread Ryan Blue
At Netflix, we disable the broadcast timeout in our defaults.

I found that it never helped catch problems. With lazy evaluation, I think
it is reasonable for a table that should be broadcast to take a long time
to build. Just because a join input is a subset or aggregation of a large
table, or itself requires a join, doesn't mean that broadcasting the data
isn't better for the final plan.

I'm not sure that a timeout for `sparkContext.broadcast` would be helpful
either. What bad behavior would this catch?

Let's just remove the timeout entirely, or disable it by default.
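For context, the knob being discussed is Spark's documented
`spark.sql.broadcastTimeout` (default 300 seconds). A sketch of the
"disable it in our defaults" approach — the chosen value below is
illustrative, not a recommendation:

```properties
# spark-defaults.conf (sketch)
# Default is 300 seconds; in practice "disabling" typically means setting
# a value far beyond the runtime of any real job.
spark.sql.broadcastTimeout   36000
```

The same setting can also be applied per session via
`spark.conf.set("spark.sql.broadcastTimeout", "36000")`.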

On Wed, Jan 30, 2019 at 9:27 AM Justin Uang  wrote:

> Hi all,
>
> We have noticed a lot of broadcast timeouts on our pipelines, and from
> some inspection, it seems that they happen when I have two threads trying
> to save two different DataFrames. We use the FIFO scheduler, so if I launch
> a job that needs all the executors, the second DataFrame's collect on the
> broadcast side is guaranteed to take longer than 5 minutes, and will throw.
>
> My question is why do we have a timeout on a collect when broadcasting? It
> seems silly that we have a small default timeout on something that is
> influenced by contention on the cluster. We are basically saying that all
> broadcast jobs need to finish in 5 minutes, regardless of our scheduling
> policy on the cluster.
>
> I'm curious about the original intention of the broadcast timeout. Perhaps
> is the broadcast timeout really meant to be a timeout on
> sparkContext.broadcast, instead of the child.executeCollectIterator()? In
> that case, would it make sense to move the timeout to wrap only
> sparkContext.broadcast?
>
> Best,
>
> Justin
>


-- 
Ryan Blue
Software Engineer
Netflix


Purpose of broadcast timeout

2019-01-30 Thread Justin Uang
Hi all,

We have noticed a lot of broadcast timeouts on our pipelines, and from some
inspection, it seems that they happen when I have two threads trying to
save two different DataFrames. We use the FIFO scheduler, so if I launch a
job that needs all the executors, the second DataFrame's collect on the
broadcast side is guaranteed to take longer than 5 minutes, and will throw.

My question is why do we have a timeout on a collect when broadcasting? It
seems silly that we have a small default timeout on something that is
influenced by contention on the cluster. We are basically saying that all
broadcast jobs need to finish in 5 minutes, regardless of our scheduling
policy on the cluster.
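The contention scenario above can be sketched in plain Python, with no
Spark involved — the names and timings are invented purely to illustrate
why a fixed timeout misfires when the "broadcast collect" is merely queued
behind another job rather than stuck:

```python
# One worker thread stands in for a fully occupied cluster under FIFO
# scheduling; the second submission is the broadcast side's collect.
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

BROADCAST_TIMEOUT = 0.15  # stand-in for spark.sql.broadcastTimeout (300s default)

pool = ThreadPoolExecutor(max_workers=1)

long_job = pool.submit(time.sleep, 0.5)          # first job takes all executors
broadcast_collect = pool.submit(lambda: "rows")  # second job's broadcast side

try:
    broadcast_collect.result(timeout=BROADCAST_TIMEOUT)
    outcome = "broadcast built"
except TimeoutError:
    # The collect never got a chance to run before the deadline.
    outcome = "broadcast timed out (queued behind the long job, not stuck)"

print(outcome)  # -> broadcast timed out (queued behind the long job, not stuck)
```

Nothing here is pathological — the work was simply scheduled later — which
is the argument for not bounding the collect with a wall-clock timeout.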

I'm curious about the original intention of the broadcast timeout. Perhaps
is the broadcast timeout really meant to be a timeout on
sparkContext.broadcast, instead of the child.executeCollectIterator()? In
that case, would it make sense to move the timeout to wrap only
sparkContext.broadcast?

Best,

Justin


Re: [VOTE] [SPARK-25994] SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-30 Thread Xiao Li
Change my vote from +1 to ++1

On Wed, Jan 30, 2019 at 6:20 AM, Xiangrui Meng  wrote:

> Correction: +0 vote doesn't mean "Don't really care". Thanks Ryan for the
> offline reminder! Below is the Apache official interpretation
> 
> of fraction values:
>
> The in-between values are indicative of how strongly the voting individual
> feels. Here are some examples of fractional votes and ways in which they
> might be intended and interpreted:
> +0: 'I don't feel strongly about it, but I'm okay with this.'
> -0: 'I won't get in the way, but I'd rather we didn't do this.'
> -0.5: 'I don't like this idea, but I can't find any rational justification
> for my feelings.'
> ++1: 'Wow! I like this! Let's do it!'
> -0.9: 'I really don't like this, but I'm not going to stand in the way if
> everyone else wants to go ahead with it.'
> +0.9: 'This is a cool idea and I like it, but I don't have time/the skills
> necessary to help out.'
>
>
> On Wed, Jan 30, 2019 at 12:31 AM Martin Junghanns
>  wrote:
>
>> Hi Dongjoon,
>>
>> Thanks for the hint! I updated the SPIP accordingly.
>>
>> I also changed the access permissions for the SPIP and design sketch docs
>> so that anyone can comment.
>>
>> Best,
>>
>> Martin
>> On 29.01.19 18:59, Dongjoon Hyun wrote:
>>
>> Hi, Xiangrui Meng.
>>
>> +1 for the proposal.
>>
>> However, please update the following section for this vote. As we see, it
>> seems to be inaccurate because today is Jan. 29th. (Almost February).
>> (Since I cannot comment on the SPIP, I replied here.)
>>
>> Q7. How long will it take?
>>
>>-
>>
>>If accepted by the community by the end of December 2018, we predict
>>to be feature complete by mid-end March, allowing for QA during April 
>> 2019,
>>making the SPIP part of the next major Spark release (3.0, ETA May, 2019).
>>
>> Bests,
>> Dongjoon.
>>
>> On Tue, Jan 29, 2019 at 8:52 AM Xiao Li  wrote:
>>
>>> +1
>>>
>>> On Tue, Jan 29, 2019 at 8:14 AM, Jules Damji  wrote:
>>>
 +1 (non-binding)
 (Heard their proposed tech-talk at Spark + A.I summit in London. Well
 attended & well received.)

 —
 Sent from my iPhone
 Pardon the dumb thumb typos :)

 On Jan 29, 2019, at 7:30 AM, Denny Lee  wrote:

 +1

 yay - let's do it!

 On Tue, Jan 29, 2019 at 6:28 AM Xiangrui Meng  wrote:

> Hi all,
>
> I want to call for a vote of SPARK-25994
> . It introduces a
> new DataFrame-based component to Spark, which supports property graph
> construction, Cypher queries, and graph algorithms. The proposal
> 
> was made available on user@
> 
> and dev@
> 
>  to
> collect input. You can also find a sketch design doc attached to
> SPARK-26028 .
>
> The vote will be up for the next 72 hours. Please reply with your vote:
>
> +1: Yeah, let's go forward and implement the SPIP.
> +0: Don't really care.
> -1: I don't think this is a good idea because of the following
> technical reasons.
>
> Best,
> Xiangrui
>



Re: [VOTE] [SPARK-25994] SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-30 Thread Xiangrui Meng
Correction: +0 vote doesn't mean "Don't really care". Thanks Ryan for the
offline reminder! Below is the Apache official interpretation

of fraction values:

The in-between values are indicative of how strongly the voting individual
feels. Here are some examples of fractional votes and ways in which they
might be intended and interpreted:
+0: 'I don't feel strongly about it, but I'm okay with this.'
-0: 'I won't get in the way, but I'd rather we didn't do this.'
-0.5: 'I don't like this idea, but I can't find any rational justification
for my feelings.'
++1: 'Wow! I like this! Let's do it!'
-0.9: 'I really don't like this, but I'm not going to stand in the way if
everyone else wants to go ahead with it.'
+0.9: 'This is a cool idea and I like it, but I don't have time/the skills
necessary to help out.'


On Wed, Jan 30, 2019 at 12:31 AM Martin Junghanns
 wrote:

> Hi Dongjoon,
>
> Thanks for the hint! I updated the SPIP accordingly.
>
> I also changed the access permissions for the SPIP and design sketch docs
> so that anyone can comment.
>
> Best,
>
> Martin
> On 29.01.19 18:59, Dongjoon Hyun wrote:
>
> Hi, Xiangrui Meng.
>
> +1 for the proposal.
>
> However, please update the following section for this vote. As we see, it
> seems to be inaccurate because today is Jan. 29th. (Almost February).
> (Since I cannot comment on the SPIP, I replied here.)
>
> Q7. How long will it take?
>
>-
>
>If accepted by the community by the end of December 2018, we predict
>to be feature complete by mid-end March, allowing for QA during April 2019,
>making the SPIP part of the next major Spark release (3.0, ETA May, 2019).
>
> Bests,
> Dongjoon.
>
> On Tue, Jan 29, 2019 at 8:52 AM Xiao Li  wrote:
>
>> +1
>>
>> On Tue, Jan 29, 2019 at 8:14 AM, Jules Damji  wrote:
>>
>>> +1 (non-binding)
>>> (Heard their proposed tech-talk at Spark + A.I summit in London. Well
>>> attended & well received.)
>>>
>>> —
>>> Sent from my iPhone
>>> Pardon the dumb thumb typos :)
>>>
>>> On Jan 29, 2019, at 7:30 AM, Denny Lee  wrote:
>>>
>>> +1
>>>
>>> yay - let's do it!
>>>
>>> On Tue, Jan 29, 2019 at 6:28 AM Xiangrui Meng  wrote:
>>>
 Hi all,

 I want to call for a vote of SPARK-25994
 . It introduces a
 new DataFrame-based component to Spark, which supports property graph
 construction, Cypher queries, and graph algorithms. The proposal
 
 was made available on user@
 
 and dev@
 
  to
 collect input. You can also find a sketch design doc attached to
 SPARK-26028 .

 The vote will be up for the next 72 hours. Please reply with your vote:

 +1: Yeah, let's go forward and implement the SPIP.
 +0: Don't really care.
 -1: I don't think this is a good idea because of the following
 technical reasons.

 Best,
 Xiangrui

>>>


Re: Self join

2019-01-30 Thread Marco Gaido
Hi all,

this thread got a bit stuck. Hence, if there are no objections, I'll go
ahead with a design doc describing the solution/workaround I mentioned
before. Any concerns?
Thanks,
Marco

On Thu, Dec 13, 2018 at 6:15 PM, Ryan Blue  wrote:

> Thanks for the extra context, Marco. I thought you were trying to propose
> a solution.
>
> On Thu, Dec 13, 2018 at 2:45 AM Marco Gaido 
> wrote:
>
>> Hi Ryan,
>>
>> My goal with this email thread is to discuss with the community if there
>> are better ideas (as I was told many other people tried to address this).
>> I'd consider this as a brainstorming email thread. Once we have a good
>> proposal, then we can go ahead with a SPIP.
>>
>> Thanks,
>> Marco
>>
>> On Wed, Dec 12, 2018 at 7:13 PM, Ryan Blue  wrote:
>>
>>> Marco,
>>>
>>> I'm actually asking for a design doc that clearly states the problem and
>>> proposes a solution. This is a substantial change and probably should be an
>>> SPIP.
>>>
>>> I think that would be more likely to generate discussion than referring
>>> to PRs or a quick paragraph on the dev list, because the only people that
>>> are looking at it now are the ones already familiar with the problem.
>>>
>>> rb
>>>
>>> On Wed, Dec 12, 2018 at 2:05 AM Marco Gaido 
>>> wrote:
>>>
 Thank you all for your answers.

 @Ryan Blue  sure, let me state the problem more
 clearly: imagine you have 2 dataframes with a common lineage (for instance
 one is derived from the other by some filtering or anything you prefer).
 And imagine you want to join these 2 dataframes. Currently, there is a fix
 by Reynold which deduplicates the join condition in case the condition is
 an equality one (please notice that in this case, it doesn't matter which
 one is on the left and which one on the right). But if the condition
 involves other comparisons, such as a ">" or a "<", this would result in an
 analysis error, because the attributes on both sides are the same (e.g. you
 have the same id#3 attribute on both sides), and you cannot deduplicate
 them blindly, as it matters which one is on which side.

 @Reynold Xin  my proposal was to add a dataset id
 in the metadata of each attribute, so that in this case we can distinguish
 from which dataframe the attribute is coming from, ie. having the
 DataFrames `df1` and `df2` where `df2` is derived from `df1`,
 `df1.join(df2, df1("a") > df2("a"))` could be resolved because we would
 know that the first attribute is taken from `df1` and so it has to be
 resolved using it and the same for the other. But I am open to any approach
 to this problem, if other people have better ideas/suggestions.

 Thanks,
 Marco
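The "dataset id in the metadata of each attribute" idea above can be
sketched in plain Python. To be clear, this is not Spark's actual
internals — `Attribute`, `dataset_id`, and `resolve` are invented here
purely to illustrate how the extra metadata breaks the tie:

```python
# Two lineage-sharing DataFrames carry the same expression ids, so a
# non-equality self-join condition like df1("a") > df2("a") is ambiguous
# unless each column reference also remembers which DataFrame produced it.
from dataclasses import dataclass

@dataclass(frozen=True)
class Attribute:
    name: str
    expr_id: int      # identical on both sides after a filter/derivation
    dataset_id: int   # hypothetical metadata distinguishing the sides

def resolve(attr: Attribute, left_id: int, right_id: int) -> str:
    """Pick the join side for an attribute by its dataset id."""
    if attr.dataset_id == left_id:
        return "left"
    if attr.dataset_id == right_id:
        return "right"
    raise ValueError("ambiguous attribute: no dataset id match")

df1_a = Attribute("a", expr_id=3, dataset_id=1)  # df1("a")
df2_a = Attribute("a", expr_id=3, dataset_id=2)  # df2("a"), df2 derived from df1

# By expr_id alone both references look identical (3 == 3); with the
# dataset id, df1("a") > df2("a") resolves each side unambiguously:
sides = (resolve(df1_a, left_id=1, right_id=2),
         resolve(df2_a, left_id=1, right_id=2))
print(sides)  # -> ('left', 'right')
```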

 On Tue, Dec 11, 2018 at 6:31 PM, Jörn Franke  wrote:

> I don't know your exact underlying business problem, but maybe a graph
> solution such as Spark GraphX meets your requirements better. Usually
> self-joins are done to address some kind of graph problem (even if you
> would not describe it as such), and such a solution is much more efficient
> for these kinds of problems.
>
> On Dec 11, 2018, at 12:44 PM, Marco Gaido  wrote:
>
> Hi all,
>
> I'd like to bring to the attention of more people a problem that has been
> around for a long time, i.e., self-joins. Currently, we have a lot of
> trouble with them. This has been reported several times to the community
> and seems to affect many people, but as of now no solution has been
> accepted for it.
>
> I created a PR some time ago in order to address the problem (
> https://github.com/apache/spark/pull/21449), but Wenchen mentioned he
> tried to fix this problem too, but so far no attempt has been successful
> because there are no clear semantics (
> https://github.com/apache/spark/pull/21449#issuecomment-393554552).
>
> So I'd like to propose discussing here the best approach for tackling
> this issue, which I think would be great to fix in 3.0.0, so that if we
> decide to introduce breaking changes in the design, we can do so.
>
> Thoughts on this?
>
> Thanks,
> Marco
>
>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: Welcome Jose Torres as a Spark committer

2019-01-30 Thread Gabor Somogyi
Congrats Jose!

BR,
G

On Wed, Jan 30, 2019 at 9:05 AM Nuthan Reddy 
wrote:

> Congrats Jose,
>
> Regards,
> Nuthan Reddy
>
>
>
> On Wed, Jan 30, 2019 at 1:22 PM Marco Gaido 
> wrote:
>
>> Congrats, Jose!
>>
>> Bests,
>> Marco
>>
>> On Wed, Jan 30, 2019 at 3:17 AM, JackyLee  wrote:
>>
>>> Congrats, Joe!
>>>
>>> Best,
>>> Jacky
>>>
>>>
>>>
>>> --
>>> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>


Re: [VOTE] [SPARK-25994] SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-30 Thread Martin Junghanns

Hi Dongjoon,

Thanks for the hint! I updated the SPIP accordingly.

I also changed the access permissions for the SPIP and design sketch 
docs so that anyone can comment.


Best,

Martin

On 29.01.19 18:59, Dongjoon Hyun wrote:

Hi, Xiangrui Meng.

+1 for the proposal.

However, please update the following section for this vote. As we see, 
it seems to be inaccurate because today is Jan. 29th. (Almost February).

(Since I cannot comment on the SPIP, I replied here.)

Q7. How long will it take?

 *

If accepted by the community by the end of December 2018, we
predict to be feature complete by mid-end March, allowing for QA
during April 2019, making the SPIP part of the next major Spark
release (3.0, ETA May, 2019).

Bests,
Dongjoon.

On Tue, Jan 29, 2019 at 8:52 AM Xiao Li  wrote:


+1

On Tue, Jan 29, 2019 at 8:14 AM, Jules Damji <dmat...@comcast.net> wrote:

+1 (non-binding)
(Heard their proposed tech-talk at Spark + A.I summit in
London. Well attended & well received.)

—
Sent from my iPhone
Pardon the dumb thumb typos :)

On Jan 29, 2019, at 7:30 AM, Denny Lee <denny.g@gmail.com> wrote:


+1

yay - let's do it!

On Tue, Jan 29, 2019 at 6:28 AM Xiangrui Meng <men...@gmail.com> wrote:

Hi all,

I want to call for a vote of SPARK-25994
. It
introduces a new DataFrame-based component to Spark,
which supports property graph construction, Cypher
queries, and graph algorithms. The proposal


was made available on user@


and dev@


 to
collect input. You can also find a sketch design doc
attached to SPARK-26028
.

The vote will be up for the next 72 hours. Please reply
with your vote:

+1: Yeah, let's go forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the
following technical reasons.

Best,
Xiangrui



Re: Welcome Jose Torres as a Spark committer

2019-01-30 Thread Nuthan Reddy
Congrats Jose,

Regards,
Nuthan Reddy



On Wed, Jan 30, 2019 at 1:22 PM Marco Gaido  wrote:

> Congrats, Jose!
>
> Bests,
> Marco
>
> On Wed, Jan 30, 2019 at 3:17 AM, JackyLee  wrote:
>
>> Congrats, Joe!
>>
>> Best,
>> Jacky
>>
>>
>>
>> --
>> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>