Re: Ask for ARM CI for spark

2019-09-22 Thread bo zhaobo
Hi Guys,

Recently, we have been testing PySpark on ARM and found some issues that
we have no idea about. Could you please have a look when you are free?
Thanks.

There are two issues:
1. The first one looks like an ARM performance issue: the job in one
PySpark test has not fully finished by the time the assert check runs.
After we changed the source code in our local env, the tests pass. We
opened a JIRA issue [1] for this. If you guys are free, please help with
it. Thanks.
2. The second one looks like a Spark internal issue. When we test
"pyspark.mllib.tests.test_streaming_algorithms:StreamingLinearRegressionWithTests.test_train_prediction",
it fails in the "condition" function. We tried to dig into it and found that
the predicted value is still [0. 0. 0.], even though we wait for a long
time in the ARM testing env. I think that's the main cause, but we failed
to work out which step goes wrong. Could you please help to figure it out?
I uploaded the test logs after inserting some print statements into the
'func' function of the test case. I tried on ARM and x86: the ARM log is
[2], the x86 log is [3]. The testing environments are identical apart from
the architecture.
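For issue 2, one detail may be worth checking (this is a toy sketch of the general behavior, not Spark's code): a streaming linear regression model starts from the supplied initial weights, typically zeros, and only moves them as micro-batches are processed. If batch processing lags on a slow host, a prediction taken too early is computed with the still-zero weights, which looks exactly like the [0. 0. 0.] output above. A minimal pure-Python illustration (all names and data here are invented for illustration):

```python
def sgd_step(weights, batch, step=0.1):
    """One least-squares gradient step over a micro-batch of (x, y) rows."""
    n = len(batch)
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / n
    return [w - step * g for w, g in zip(weights, grad)]

weights = [0.0, 0.0]  # model starts at zero, like initialWeights = [0.0, 0.0]
batch = [([1.0, 1.0], 2.0), ([2.0, 1.0], 3.0)]  # data generated by y = x1 + x2

# Prediction taken before any batch has been processed: computed with the
# initial zero weights, so it is exactly 0.0 regardless of the input.
prediction_before = sum(w * xi for w, xi in zip(weights, [1.0, 1.0]))
print(prediction_before)  # 0.0

# After enough micro-batches the weights move toward [1, 1] and the
# prediction approaches the true value 2.0.
for _ in range(200):
    weights = sgd_step(weights, batch)
prediction_after = sum(w * xi for w, xi in zip(weights, [1.0, 1.0]))
print(prediction_after)
```

If something similar is happening in the real test, the interesting question would be why the micro-batches are not being consumed in time on the ARM host.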

Thanks, if you are free, please help us.
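On issue 1, timing-sensitive checks like this are often guarded with a retry helper that polls a condition function instead of asserting once, so that slower hardware simply takes more iterations rather than failing. A minimal standalone sketch of that idea (our own code, not the Spark implementation):

```python
import time

def eventually(condition, timeout=30.0, catch_assertions=False):
    """Poll `condition` until it returns a truthy value or `timeout`
    seconds elapse.  Re-checking avoids asserting on a result that a
    background/streaming job simply has not produced yet."""
    start = time.time()
    last_assertion = None
    while time.time() - start < timeout:
        try:
            if condition():
                return True
        except AssertionError as exc:
            if not catch_assertions:
                raise
            last_assertion = exc  # remember the failure, retry until timeout
        time.sleep(0.05)
    if last_assertion is not None:
        raise last_assertion
    raise AssertionError("condition was never met in %.1f s" % timeout)

# Usage sketch: wait for a slow job to fill `results` instead of
# asserting on its length immediately.
results = [1, 2, 3]  # stand-in for output that arrives asynchronously
ok = eventually(lambda: len(results) == 3, timeout=5.0)
```

With a generous timeout, a guard like this usually separates genuine failures from machines that are merely slower.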

Best Regards

[1] https://issues.apache.org/jira/browse/SPARK-29205
[2] https://etherpad.net/p/pyspark-arm
[3] https://etherpad.net/p/pyspark-x86

Tianhua huang  wrote on Thu, Sep 19, 2019 at 10:59 AM:

> @Dongjoon Hyun  ,
>
> Sure, and I have updated the JIRA already :)
> https://issues.apache.org/jira/browse/SPARK-29106
> If anything is missing, please let me know. Thank you.
>
> On Thu, Sep 19, 2019 at 12:44 AM Dongjoon Hyun 
> wrote:
>
>> Hi, Tianhua.
>>
>> Could you summarize the detail on the JIRA once more?
>> It will be very helpful for the community. Also, I've been waiting on
>> that JIRA. :)
>>
>> Bests,
>> Dongjoon.
>>
>>
>> On Mon, Sep 16, 2019 at 11:48 PM Tianhua huang 
>> wrote:
>>
>>> @shane knapp  thank you very much. I opened an
>>> issue for this, https://issues.apache.org/jira/browse/SPARK-29106; we
>>> can talk through the details in it :)
>>> And we will prepare an ARM instance today and send the info to your
>>> email later.
>>>
>>> On Tue, Sep 17, 2019 at 4:40 AM Shane Knapp  wrote:
>>>
 @Tianhua huang  sure, i think we can get
 something sorted for the short-term.

 all we need is ssh access (i can provide an ssh key), and i can then
 have our jenkins master launch a remote worker on that instance.

 instance setup, etc, will be up to you.  my support for the time being
 will be to create the job and 'best effort' for everything else.

 this should get us up and running asap.

 is there an open JIRA for jenkins/arm test support?  we can move the
 technical details about this idea there.

 On Sun, Sep 15, 2019 at 9:03 PM Tianhua huang <
 huangtianhua...@gmail.com> wrote:

> @Sean Owen  , so sorry to reply late, we had a
> Mid-Autumn holiday:)
>
> If you hope to integrate ARM CI into amplab Jenkins, we can offer the
> ARM instance, and then the ARM job will run together with the other x86
> jobs. Is there a guideline for doing this? @shane knapp
>   would you help us?
>
> On Thu, Sep 12, 2019 at 9:36 PM Sean Owen  wrote:
>
>> I don't know what's involved in actually accepting or operating those
>> machines, so can't comment there, but in the meantime it's good that you
>> are running these tests and can help report changes needed to keep it
>> working with ARM. I would continue with that for now.
>>
>> On Wed, Sep 11, 2019 at 10:06 PM Tianhua huang <
>> huangtianhua...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> For the whole work process of spark ARM CI, we want to make 2 things
>>> clear.
>>>
>>> The first thing is:
>>> About Spark ARM CI, we now have two periodic jobs: one job[1] is based
>>> on commit[2] (which already fixed the failed replay tests issue[3]; we
>>> made a new test branch based on the date 09-09-2019), and the other
>>> job[4] is based on the Spark master.
>>>
>>> The first job tests the specified branch, to prove that our ARM
>>> CI is good and stable.
>>> The second job checks the Spark master every day, so we can find out
>>> whether the latest commits affect the ARM CI. The build history and
>>> results show that some problems are easier to find on ARM,
>>> like SPARK-28770 ,
>>> and also that we will make the effort to trace and figure them out;
>>> so far we have found and fixed several problems[5][6][7]. Thanks to
>>> everyone in the community :). And we believe that A

Re: [DISCUSS] Spark 2.5 release

2019-09-22 Thread Hyukjin Kwon
+1 for Matei's as well.

On Sun, 22 Sep 2019, 14:59 Marco Gaido,  wrote:

> I agree with Matei too.
>
> Thanks,
> Marco
>
> On Sun, Sep 22, 2019 at 03:44 Dongjoon Hyun <
> dongjoon.h...@gmail.com> wrote:
>
>> +1 for Matei's suggestion!
>>
>> Bests,
>> Dongjoon.
>>
>> On Sat, Sep 21, 2019 at 5:44 PM Matei Zaharia 
>> wrote:
>>
>>> If the goal is to get people to try the DSv2 API and build DSv2 data
>>> sources, can we recommend the 3.0-preview release for this? That would get
>>> people shifting to 3.0 faster, which is probably better overall compared to
>>> maintaining two major versions. There’s not that much else changing in 3.0
>>> if you already want to update your Java version.
>>>
>>> On Sep 21, 2019, at 2:45 PM, Ryan Blue 
>>> wrote:
>>>
>>> > If you insist we shouldn't change the unstable temporary API in 3.x .
>>> . .
>>>
>>> Not what I'm saying at all. I said we should carefully consider whether
>>> a breaking change is the right decision in the 3.x line.
>>>
>>> All I'm suggesting is that we can make a 2.5 release with the feature
>>> and an API that is the same as the one in 3.0.
>>>
>>> > I also don't get this backporting a giant feature to 2.x line
>>>
>>> I am planning to do this so we can use DSv2 before 3.0 is released. Then
>>> we can have a source implementation that works in both 2.x and 3.0 to make
>>> the transition easier. Since I'm already doing the work, I'm offering to
>>> share it with the community.
>>>
>>>
>>> On Sat, Sep 21, 2019 at 2:36 PM Reynold Xin  wrote:
>>>
 Because for example we'd need to move the location of InternalRow,
 breaking the package name. If you insist we shouldn't change the unstable
 temporary API in 3.x to maintain compatibility with 3.0, which is totally
 different from my understanding of the situation when you exposed it, then
 I'd say we should gate 3.0 on having a stable row interface.

 I also don't get this backporting a giant feature to 2.x line ... as
 suggested by others in the thread, DSv2 would be one of the main reasons
 people upgrade to 3.0. What's so special about DSv2 that we are doing this?
 Why not abandon 3.0 entirely and backport all the features to 2.x?



 On Sat, Sep 21, 2019 at 2:31 PM, Ryan Blue  wrote:

> Why would that require an incompatible change?
>
> We *could* make an incompatible change and remove support for
> InternalRow, but I think we would want to carefully consider whether that
> is the right decision. And in any case, we would be able to keep 2.5 and
> 3.0 compatible, which is the main goal.
>
> On Sat, Sep 21, 2019 at 2:28 PM Reynold Xin 
> wrote:
>
> How would you not make incompatible changes in 3.x? As discussed the
> InternalRow API is not stable and needs to change.
>
> On Sat, Sep 21, 2019 at 2:27 PM Ryan Blue  wrote:
>
> > Making downstream diverge their implementations heavily between
> minor versions (say, 2.4 vs 2.5) wouldn't be a good experience
>
> You're right that the API has been evolving in the 2.x line. But, it
> is now reasonably stable with respect to the current feature set and we
> should not need to break compatibility in the 3.x line. Because we have
> reached our goals for the 3.0 release, we can backport at least those
> features to 2.x and confidently have an API that works in both a 2.x
> release and is compatible with 3.0, if not 3.1 and later releases as well.
>
> > I'd rather say preparation of Spark 2.5 should be started after
> Spark 3.0 is officially released
>
> The reason I'm suggesting this is that I'm already going to do the
> work to backport the 3.0 release features to 2.4. I've been asked by
> several people when DSv2 will be released, so I know there is a lot of
> interest in making this available sooner than 3.0. If I'm already doing 
> the
> work, then I'd be happy to share that with the community.
>
> I don't see why 2.5 and 3.0 are mutually exclusive. We can work on 2.5
> while preparing the 3.0 preview and fixing bugs. For DSv2, the work is
> about complete so we can easily release the same set of features and API 
> in
> 2.5 and 3.0.
>
> If we decide for some reason to wait until after 3.0 is released, I
> don't know that there is much value in a 2.5. The purpose is to be a step
> toward 3.0, and releasing that step after 3.0 doesn't seem helpful to me.
> It also wouldn't get these features out any sooner than 3.0, as a 2.5
> release probably would, given the work needed to validate the incompatible
> changes in 3.0.
>
> > DSv2 change would be the major backward incompatibility which Spark
> 2.x users may hesitate to upgrade
>
> As I pointed out, DSv2 has been changing in the 2.x line, so this is
> expected. I don't think it will need incompatible changes in the 3.x line.
>
>