Status of Spark testing on ARM64

2019-11-25 Thread Tianhua huang
Hi all,
I'd like to share some information about the ARM CI for Spark:

Our team and the community are working on building/testing Spark master on ARM64
servers. After finding and fixing some issues[1], we have integrated two ARM
testing jobs[2] into the community CI (AMPLAB Jenkins).
They run as daily jobs and have been running stably for a few weeks, and the
two ARM testing jobs generally succeed.
Thanks to Sean Owen, Shane Knapp, Dongjoon Hyun and the community for helping us :)

If you are interested, please have a try :) Until
https://github.com/apache/spark/pull/26636 is merged, you first have to download
org.openlabtesting.leveldbjni:leveldbjni-all:1.8 and install it into your local
Maven repository:

wget https://repo1.maven.org/maven2/org/openlabtesting/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar

mvn install:install-file -DgroupId=org.fusesource.leveldbjni \
  -DartifactId=leveldbjni-all -Dversion=1.8 -Dpackaging=jar \
  -Dfile=leveldbjni-all-1.8.jar

Then you can build and test Spark on an ARM64 server.
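The build itself then proceeds with Spark's normal Maven wrapper. A minimal sketch, assuming a Spark source checkout with the workaround above already applied (the exact build profiles you need depend on your setup):

```shell
# Build Spark on the ARM64 server, skipping tests first so the local
# Maven cache (~/.m2/repository) is populated:
./build/mvn -DskipTests clean package

# Then run the Java/Scala test suites:
./build/mvn test
```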

If you have any questions, please don't hesitate to contact me, thanks all!

[1]:
https://issues.apache.org/jira/browse/SPARK-28770 (
https://github.com/apache/spark/pull/25673)
https://issues.apache.org/jira/browse/SPARK-28519 (
https://github.com/apache/spark/pull/25279)
https://issues.apache.org/jira/browse/SPARK-28433 (
https://github.com/apache/spark/pull/25186)
https://issues.apache.org/jira/browse/SPARK-28467 (
https://github.com/apache/spark/pull/25864)
https://issues.apache.org/jira/browse/SPARK-29286 (
https://github.com/apache/spark/pull/26021)
https://issues.apache.org/jira/browse/SPARK-29286
   (
https://github.com/apache/spark/pull/26636) --- this one is in progress

[2]:
   https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/
 The job spark-master-test-maven-arm is the same as the community x86 job
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/.
 It runs all Java/Scala tests, about 21,112 tests in total.
   https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/
 The job spark-master-test-python-arm runs the PySpark tests with Python 3.6.
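For reference, the PySpark job's test run can be reproduced locally with Spark's bundled test runner; a sketch assuming a built Spark checkout with python3.6 on the PATH:

```shell
# Run the PySpark test suite against Python 3.6, as the
# spark-master-test-python-arm job does:
python/run-tests --python-executables=python3.6
```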


Re: Ask for ARM CI for spark

2019-11-17 Thread Tianhua huang
We can talk about this later, but I have to update a few things :)

- It (largely) worked previously
  --- But no one was sure about this before the ARM testing, and it isn't
stated anywhere; saying so officially would make it clearer.
- I think you're also saying you don't have 100% tests passing anyway,
though probably just small issues
  --- The Maven and Python tests are 100% passing, see
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/ and
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/
- It does not seem to merit a special announcement from the PMC among
the 2000+ changes in Spark 3
  --- It's important to users; I believe it deserves one.

On Mon, Nov 18, 2019 at 10:06 AM Sean Owen  wrote:

> Same response as before:
>
> - It is in the list of resolved JIRAs, of course
> - It (largely) worked previously
> - I think you're also saying you don't have 100% tests passing anyway,
> though probably just small issues
> - It does not seem to merit a special announcement from the PMC among
> the 2000+ changes in Spark 3
> - You are welcome to announce (on the project's user@ list if you
> like) whatever you want. Obviously, this is already well advertised on
> dev@
>
> I think you are asking for what borders on endorsement, and no that
> doesn't sound appropriate. Please just announce whatever you like as
> suggested.
>
> Sean
>
> On Sun, Nov 17, 2019 at 8:01 PM Tianhua huang 
> wrote:
> >
> > @Sean Owen,
> > I'm afraid I don't agree with you this time, I still remember no one can
> tell me whether Spark supports ARM or how much Spark can support ARM when I
> asked this first time on Dev@,  you're very kind and told me to build and
> test on ARM locally and so sorry I think you were not sure much about this
> at that moment, right? Then I and my team work with community, we
> found/fixed several issues, integrate arm jobs into AMPLAB Jenkins, and the
> daily jobs has been stablely running for few weeks... after these efforts
> why not announce this officially in Spark releasenote? I believe after this
> everyone will know Spark is fully testing on ARM on community CI, Spark
> supports ARM basically, it's amazing and this will be very helpful. So what
> do you think? Or what are you worrying about?
>


Re: Ask for ARM CI for spark

2019-11-17 Thread Tianhua huang
@Sean Owen ,
I'm afraid I don't agree with you this time. I still remember that no one
could tell me whether Spark supports ARM, or how well, when I first asked on
dev@. You were very kind and told me to build and test on ARM locally, but I
think you were not very sure about it at that moment, right? Then my team and
I worked with the community: we found and fixed several issues, integrated ARM
jobs into AMPLAB Jenkins, and the daily jobs have been running stably for a
few weeks... After these efforts, why not announce this officially in the
Spark release notes? I believe that then everyone will know Spark is fully
tested on ARM on the community CI and basically supports ARM; that will be
very helpful. So what do you think? Or what are you worried about?

On Mon, Nov 18, 2019 at 2:28 AM Steve Loughran  wrote:

> The ASF PR team would like something like "Spark now supports ARM" in
> press releases. And don't forget: they do like to be involved in the
> launch of the final release.
>
> On Fri, Nov 15, 2019 at 9:46 AM bo zhaobo 
> wrote:
>
>> Hi @Sean Owen  ,
>>
>> Thanks for your idea.
>>
>> We may have used the wrong words to describe our request. It's true that we
>> cannot simply say "Spark supports ARM from release 3.0.0", and we also cannot
>> say the past releases cannot run on ARM. But the reality is that the past
>> releases didn't get fully tested on ARM like the current testing we do, and
>> it's true that the current CI system had no resources that could fit this
>> kind of request (testing on ARM).
>>
>> And please consider: if a user wants to run the latest Spark release on
>> ARM (or even an old release), but the community doesn't say that the specific
>> Spark release was tested on ARM, the user might think there is a risk in
>> running on ARM. If they have no choice and must run Spark on ARM,
>> they will build a CI system by themselves. That's very expensive, right?
>> But now the community will do the same testing on ARM upstream, which
>> will save the users' resources. That's why an announcement by the community
>> in some form is official and best, such as "In the XXX release, Spark is
>> fully tested on ARM" or "In the XXX release, the Spark community integrated
>> an ARM CI system." Once users see that, they will be very comfortable using
>> Spark on ARM. ;-)
>>
>> Thanks for your patience; we are just discussing here, so if I say anything
>> wrong, please feel free to correct me. ;-)
>>
>> Thanks,
>>
>> BR
>>
>> ZhaoBo
>>
>>
>>
>>
>>
>> Sean Owen wrote on Fri, Nov 15, 2019 at 5:04 PM:
>>
>>> I don't think that's true either, not yet. Being JVM-based with no
>>> native code, I just don't even think it would be common to assume it
>>> doesn't work and it apparently has. If you want to announce it, that's
>>> up to you.
>>>
>>> On Fri, Nov 15, 2019 at 3:01 AM Tianhua huang 
>>> wrote:
>>> >
>>> > @Sean Owen,
>>> > Thanks for attention this.
>>> > I agree with you, it's probably not very appropriate to say 'support
>>> arm from 3.0 release'. How about change to the word "Spark community
>>> supports fully tests on arm from 3.0 release"?
>>> > Let's try to think about it from the user's point of view than
>>> developer,users have to know exactly whether spark supports arm well and
>>> wheter spark fully tests on arm. If we specify spark is fully tests on arm,
>>> I believe users will have much more confidence to run spark on arm.
>>> >
>>>
>>


Re: Ask for ARM CI for spark

2019-11-15 Thread Tianhua huang
@Sean Owen,
Thanks for your attention to this.
I agree with you; it's probably not appropriate to say 'supports ARM from
the 3.0 release'. How about changing the wording to "The Spark community
fully tests on ARM from the 3.0 release"?
Let's try to think about it from the user's point of view rather than the
developer's: users need to know whether Spark supports ARM well and whether
Spark is fully tested on ARM. If we state that Spark is fully tested on ARM,
I believe users will have much more confidence running Spark on ARM.

On Fri, Nov 15, 2019 at 4:05 PM Sean Owen  wrote:

> I'm not against it, but the JIRAs will already show that the small
> ARM-related difference like floating-point in log() were resolved.
> Those aren't major enough to highlight as key changes in the 2000+
> resolved. it didn't really not-work before either, as I understand;
> Spark isn't specific to an architecture, so I don't know if that
> situation changed materially in 3.0; it still otherwise ran in 2.x on
> ARM right? It would imply people couldn't use it on ARM previously.
> You can certainly announce you endorse 3.0 as a good release for ARM
> and/or call attention to it on user@.
>
> On Thu, Nov 14, 2019 at 9:01 PM bo zhaobo 
> wrote:
> >
> > Hi @Sean Owen ,
> >
> > Thanks for the reply. We know that the Spark community has its own release
> date and plan, and we are happy to follow it. But we think it would be great
> if the community could add a sentence to the next release notes saying "Spark
> supports ARM from this release." once we finish the test work on ARM.
> That's all. We just want an official statement from the community that Spark
> supports ARM, to attract more users to run Spark on it.
> >
>


Re: Ask for ARM CI for spark

2019-11-14 Thread Tianhua huang
@Sean,
Yes, you are right, we don't need to create a separate release of Spark for
ARM; it's enough to add a release note saying that Spark supports the ARM
architecture.
About the test failures: one or two tests sometimes time out on our
low-performance ARM instance. We have now donated a high-performance ARM
instance to AMPLab and are waiting for Shane to set up the jobs on it.

On Fri, Nov 15, 2019 at 10:13 AM Sean Owen  wrote:

> I don't quite understand. You are saying tests don't pass yet, so why
> would anyone yet run these tests regularly?
> If it's because the instances aren't fast enough, use bigger instances?
> I don't think anyone would create a separate release of Spark for ARM, no.
> But why would that be necessary?
>
> On Thu, Nov 14, 2019 at 7:28 PM bo zhaobo 
> wrote:
>
>> Hi Spark team,
>>
>> Any ideas about the above email? Thank you.
>>
>> BR
>>
>> ZhaoBo
>>
>>
>>
>> Tianhua huang wrote on Tue, Nov 12, 2019 at 2:47 PM:
>>
>>> Hi all,
>>>
>>> Spark arm jobs have built for some time, and now there are two jobs[1]
>>> spark-master-test-maven-arm
>>> <https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/>
>>> and spark-master-test-python-arm
>>> <https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/>,
>>> we can see there are some build failures, but it because of the poor
>>> performance of the arm instance, and now we begin to build spark arm jobs
>>> on other high performance instances, and the build/test are all success, we
>>> plan to donate the instance to amplab later.  According to the build
>>> history, we are very happy to say spark is supported on aarch64 platform,
>>> and I suggest to add this good news into spark-3.0.0 releasenotes. Maybe
>>> community could provide an arm-supported release of spark at the meanwhile?
>>>
>>> [1]
>>> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/
>>> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/
>>>
>>>
>>> ps: the jira https://issues.apache.org/jira/browse/SPARK-29106 trace
>>> the whole work, thank you very much Shane:)
>>>
>>


Re: Ask for ARM CI for spark

2019-11-11 Thread Tianhua huang
Hi all,

The Spark ARM jobs have been running for some time, and now there are two jobs[1],
spark-master-test-maven-arm
<https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/>
and spark-master-test-python-arm
<https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/>.
There are some build failures, but they are due to the poor performance of
the ARM instance; we have now begun to build the Spark ARM jobs on other
high-performance instances, and the builds/tests all succeed. We plan to
donate an instance to AMPLab later. Given the build history, we are very
happy to say that Spark is supported on the aarch64 platform, and I suggest
adding this good news to the Spark 3.0.0 release notes. Maybe the community
could provide an ARM-supported release of Spark as well?

[1]
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-arm/
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-python-arm/

P.S. The JIRA https://issues.apache.org/jira/browse/SPARK-29106 tracks the
whole effort; thank you very much Shane :)

On Thu, Oct 17, 2019 at 2:52 PM bo zhaobo 
wrote:

> Just Notes: The jira issue link is
> https://issues.apache.org/jira/browse/SPARK-29106
>
>
>
>
> Tianhua huang wrote on Thu, Oct 17, 2019 at 10:47 AM:
>
>> OK, let's update infos there. Thanks.
>>
>> On Thu, Oct 17, 2019 at 1:52 AM Shane Knapp  wrote:
>>
>>> i totally missed the spark jira from earlier...  let's move the
>>> conversation there!
>>>
>>> On Tue, Oct 15, 2019 at 6:21 PM bo zhaobo 
>>> wrote:
>>>
>>>> Shane, awesome! We will try our best to finish the tests and the
>>>> requests on the VM soon. Once we finish those things, we will send you
>>>> an email, and then we can continue with the rest. Thank you very much.
>>>>
>>>> Best Regards,
>>>>
>>>> ZhaoBo
>>>>
>>>> Shane Knapp wrote on Wed, Oct 16, 2019 at 3:47 AM:
>>>>
>>>>> ok!  i'm able to successfully log in to the VM!
>>>>>
>>>>> i also have created a jenkins worker entry:
>>>>> https://amplab.cs.berkeley.edu/jenkins/computer/spark-arm-vm/
>>>>>
>>>>> it's a pretty bare-bones VM, so i have some suggestions/requests
>>>>> before we can actually proceed w/testing.  i will not be able to perform
>>>>> any system configuration, as i don't have the cycles to reverse-engineer
>>>>> the ansible setup and test it all out.
>>>>>
>>>>> * java is not installed, please install the following:
>>>>>   - java8 min version 1.8.0_191
>>>>>   - java11 min version 11.0.1
>>>>>
>>>>> * it appears from the ansible playbook that there are other deps that
>>>>> need to be installed.
>>>>>   - please install all deps
>>>>>   - manually run the tests until they pass
>>>>>
>>>>> * the jenkins user should NEVER have sudo or any root-level access!
>>>>>
>>>>> * once the arm tests pass when manually run, take a snapshot of this
>>>>> image so we can recreate it w/o needing to reinstall everything
>>>>>
>>>>> after that's done i can finish configuring the jenkins worker and set
>>>>> up a build...
>>>>>
>>>>> thanks!
>>>>>
>>>>> shane
>>>>>
>>>>>
>>>>> On Mon, Oct 14, 2019 at 8:34 PM Shane Knapp 
>>>>> wrote:
>>>>>
>>>>>> yes, i will get to that tomorrow.  today was spent cleaning up the
>>>>>> mess from last week.
>>>>>>
>>>>>> On Mon, Oct 14, 2019 at 6:18 PM bo zhaobo <
>>>>>> bzhaojyathousa...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi shane,
>>>>>>>
>>>>>>> That's great news that AMPLab is back ;-). If possible, could you
>>>>>>> please take a few minutes to check that the ARM VM is accessible from your
>>>>>>> side? And is there a plan from you for the whole ARM test integration?
>>>>>>> (How about we finish it this month?) Thanks.
>>>>>>>
>>>>>>> Best regards,
>>>>

Re: Ask for ARM CI for spark

2019-09-17 Thread Tianhua huang
@shane knapp  thank you very much. I opened an issue
for this, https://issues.apache.org/jira/browse/SPARK-29106; we can discuss
the details there :)
And we will prepare an ARM instance today and send the info to your
email later.

On Tue, Sep 17, 2019 at 4:40 AM Shane Knapp  wrote:

> @Tianhua huang  sure, i think we can get
> something sorted for the short-term.
>
> all we need is ssh access (i can provide an ssh key), and i can then have
> our jenkins master launch a remote worker on that instance.
>
> instance setup, etc, will be up to you.  my support for the time being
> will be to create the job and 'best effort' for everything else.
>
> this should get us up and running asap.
>
> is there an open JIRA for jenkins/arm test support?  we can move the
> technical details about this idea there.
>
> On Sun, Sep 15, 2019 at 9:03 PM Tianhua huang 
> wrote:
>
>> @Sean Owen  , so sorry to reply late, we had a
>> Mid-Autumn holiday:)
>>
>> If you hope to integrate ARM CI to amplab jenkins, we can offer the arm
>> instance, and then the ARM job will run together with other x86 jobs, so
>> maybe there is a guideline to do this? @shane knapp 
>> would you help us?
>>
>> On Thu, Sep 12, 2019 at 9:36 PM Sean Owen  wrote:
>>
>>> I don't know what's involved in actually accepting or operating those
>>> machines, so can't comment there, but in the meantime it's good that you
>>> are running these tests and can help report changes needed to keep it
>>> working with ARM. I would continue with that for now.
>>>
>>> On Wed, Sep 11, 2019 at 10:06 PM Tianhua huang <
>>> huangtianhua...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> For the whole work process of spark ARM CI, we want to make 2 things
>>>> clear.
>>>>
>>>> The first thing is:
>>>> About spark ARM CI, now we have two periodic jobs, one job[1] based on
>>>> commit[2](which already fixed the replay tests failed issue[3], we made a
>>>> new test branch based on date 09-09-2019), the other job[4] based on spark
>>>> master.
>>>>
>>>> The first job we test on the specified branch to prove that our ARM CI
>>>> is good and stable.
>>>> The second job checks spark master every day, then we can find whether
>>>> the latest commits affect the ARM CI. According to the build history and
>>>> result, it shows that some problems are easier to find on ARM like
>>>> SPARK-28770 <https://issues.apache.org/jira/browse/SPARK-28770>, and
>>>> it also shows that we would make efforts to trace and figure them out, till
>>>> now we have found and fixed several problems[5][6][7], thanks everyone of
>>>> the community :). And we believe that ARM CI is very necessary, right?
>>>>
>>>> The second thing is:
>>>> We plan to run the jobs for a period of time, and you can see the
>>>> result and logs from 'build history' of the jobs console, if everything
>>>> goes well for one or two weeks could community accept the ARM CI? or how
>>>> long the periodic jobs to run then our community could have enough
>>>> confidence to accept the ARM CI? As you suggested before, it's good to
>>>> integrate ARM CI to amplab jenkins, we agree that and we can donate the ARM
>>>> instances and then maintain the ARM-related test jobs together with
>>>> community, any thoughts?
>>>>
>>>> Thank you all!
>>>>
>>>> [1]
>>>> http://status.openlabtesting.org/job/spark-unchanged-branch-unit-test-hadoop-2.7-arm64
>>>> [2]
>>>> https://github.com/apache/spark/commit/0ed9fae45769d4b06b8cf8128f462f09ff3d9a72
>>>> [3] https://issues.apache.org/jira/browse/SPARK-28770
>>>> [4]
>>>> http://status.openlabtesting.org/builds?job_name=spark-master-unit-test-hadoop-2.7-arm64
>>>> [5] https://github.com/apache/spark/pull/25186
>>>> [6] https://github.com/apache/spark/pull/25279
>>>> [7] https://github.com/apache/spark/pull/25673
>>>>
>>>>
>>>>
>>>> On Fri, Aug 16, 2019 at 11:24 PM Sean Owen  wrote:
>>>>
>>>>> Yes, I think it's just local caching. After you run the build you
>>>>> should find lots of stuff cached at ~/.m2/repository and it won't download
>>>>> every time.
>>>>>
>>>>> On Fri, Aug 16, 2019 at 3:01 AM bo zhaobo 
>>>>> wrote:
>>>>>
>>>>>> 

Re: Ask for ARM CI for spark

2019-09-15 Thread Tianhua huang
@Sean Owen , so sorry for the late reply, we had a Mid-Autumn
holiday :)

If you would like to integrate the ARM CI into AMPLab Jenkins, we can offer
the ARM instance, and then the ARM job will run together with the other x86
jobs. Is there perhaps a guideline for doing this? @shane knapp 
would you help us?

On Thu, Sep 12, 2019 at 9:36 PM Sean Owen  wrote:

> I don't know what's involved in actually accepting or operating those
> machines, so can't comment there, but in the meantime it's good that you
> are running these tests and can help report changes needed to keep it
> working with ARM. I would continue with that for now.
>
> On Wed, Sep 11, 2019 at 10:06 PM Tianhua huang 
> wrote:
>
>> Hi all,
>>
>> For the whole work process of spark ARM CI, we want to make 2 things
>> clear.
>>
>> The first thing is:
>> About spark ARM CI, now we have two periodic jobs, one job[1] based on
>> commit[2](which already fixed the replay tests failed issue[3], we made a
>> new test branch based on date 09-09-2019), the other job[4] based on spark
>> master.
>>
>> The first job we test on the specified branch to prove that our ARM CI is
>> good and stable.
>> The second job checks spark master every day, then we can find whether
>> the latest commits affect the ARM CI. According to the build history and
>> result, it shows that some problems are easier to find on ARM like
>> SPARK-28770 <https://issues.apache.org/jira/browse/SPARK-28770>, and it
>> also shows that we would make efforts to trace and figure them out, till
>> now we have found and fixed several problems[5][6][7], thanks everyone of
>> the community :). And we believe that ARM CI is very necessary, right?
>>
>> The second thing is:
>> We plan to run the jobs for a period of time, and you can see the result
>> and logs from 'build history' of the jobs console, if everything goes well
>> for one or two weeks could community accept the ARM CI? or how long the
>> periodic jobs to run then our community could have enough confidence to
>> accept the ARM CI? As you suggested before, it's good to integrate ARM CI
>> to amplab jenkins, we agree that and we can donate the ARM instances and
>> then maintain the ARM-related test jobs together with community, any
>> thoughts?
>>
>> Thank you all!
>>
>> [1]
>> http://status.openlabtesting.org/job/spark-unchanged-branch-unit-test-hadoop-2.7-arm64
>> [2]
>> https://github.com/apache/spark/commit/0ed9fae45769d4b06b8cf8128f462f09ff3d9a72
>> [3] https://issues.apache.org/jira/browse/SPARK-28770
>> [4]
>> http://status.openlabtesting.org/builds?job_name=spark-master-unit-test-hadoop-2.7-arm64
>> [5] https://github.com/apache/spark/pull/25186
>> [6] https://github.com/apache/spark/pull/25279
>> [7] https://github.com/apache/spark/pull/25673
>>
>>
>>
>> On Fri, Aug 16, 2019 at 11:24 PM Sean Owen  wrote:
>>
>>> Yes, I think it's just local caching. After you run the build you should
>>> find lots of stuff cached at ~/.m2/repository and it won't download every
>>> time.
>>>
>>> On Fri, Aug 16, 2019 at 3:01 AM bo zhaobo 
>>> wrote:
>>>
>>>> Hi Sean,
>>>> Thanks for the reply, and apologies for the confusion.
>>>> I know the dependencies are downloaded by SBT or Maven. But the
>>>> Spark QA job also runs "mvn clean package", so why doesn't its log show
>>>> downloads from Maven Central[1], and why does it build so fast? Is the
>>>> reason that Spark's Jenkins builds the Spark jars on physical machines
>>>> and doesn't destroy the test env after a job finishes? Then later jobs
>>>> building Spark get the dependency jars from the local cache, since
>>>> previous jobs ran "mvn package" and those dependencies were already
>>>> downloaded on the local worker machine. Am I right? Is that why the job
>>>> log[1] doesn't print any download information from Maven Central?
>>>>
>>>> Thank you very much.
>>>>
>>>> [1]
>>>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-master-test-maven-hadoop-2.6-ubuntu-testing/lastBuild/consoleFull
>>>>
>>>>
>>>> Best regards
>>>>
>>>> ZhaoBo
>>>>

Re: Ask for ARM CI for spark

2019-08-15 Thread Tianhua huang
@Sean Owen , thanks for your reply.
I basically agree with you; two points I have to make :)
First, maybe I didn't express it clearly enough: we download from Maven
Central in our test system, but the community Jenkins CI tests never seem to
download jar packages from the Maven Central repo. Our question is whether
there is an internal Maven repo in the community Jenkins.
Second, about the failed tests: of course we will continue to figure them
out, and we hope someone can help/join us :) But I'm afraid we shouldn't have
to wait for it to be "stable" (maybe you mean no failed tests?). The failed
tests in ReplayListenerSuite mentioned in the last mail passed before; we
suspect the regression was introduced by
https://github.com/apache/spark/pull/23767. We reverted the change and the
tests passed, so we hope someone can look deeper into it. Our tests are based
on master, so if a change introduces errors the tests will fail; I think this
is one reason we need ARM CI.
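To confirm a suspected regression like this, the failing suite can be re-run on its own before and after reverting the commit. A sketch with Spark's Maven wrapper; the fully-qualified suite name is an assumption based on the test output above:

```shell
# Run only ReplayListenerSuite in the core module;
# -Dtest=none skips the Java tests, -DwildcardSuites selects the Scala suite.
./build/mvn -pl core -Dtest=none \
    -DwildcardSuites=org.apache.spark.scheduler.ReplayListenerSuite test
```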

Thank you all :)

On Thu, Aug 15, 2019 at 9:58 PM Sean Owen  wrote:

> I think the right goal is to fix the remaining issues first. If we set up
> CI/CD it will only tell us there are still some test failures. If it's
> stable, and not hard to add to the existing CI/CD, yes it could be done
> automatically later. You can continue to test on ARM independently for now.
>
> It sounds indeed like there are some networking problems in the test
> system if you're not able to download from Maven Central. That rarely takes
> significant time, and there aren't project-specific mirrors here. You might
> be able to point at a closer public mirror, depending on where you are.
>
> On Thu, Aug 15, 2019 at 5:43 AM Tianhua huang 
> wrote:
>
>> Hi all,
>>
>> I want to discuss spark ARM CI again, we took some tests on arm instance
>> based on master and the job includes
>> https://github.com/theopenlab/spark/pull/13  and k8s integration
>> https://github.com/theopenlab/spark/pull/17/ , there are several things
>> I want to talk about:
>>
>> First, about the failed tests:
>> 1.we have fixed some problems like
>> https://github.com/apache/spark/pull/25186 and
>> https://github.com/apache/spark/pull/25279, thanks sean owen and others
>> to help us.
>> 2.we tried k8s integration test on arm, and met an error: apk fetch
>> hangs,  the tests passed  after adding '--network host' option for command
>> `docker build`, see:
>>
>> https://github.com/theopenlab/spark/pull/17/files#diff-5b731b14068240d63a93c393f6f9b1e8R176
>> , the solution refers to
>> https://github.com/gliderlabs/docker-alpine/issues/307  and I don't know
>> whether it happened once in community CI, or maybe we should submit a pr to
>> pass  '--network host' when `docker build`?
>> 3.we found there are two tests failed after the commit
>> https://github.com/apache/spark/pull/23767  :
>>ReplayListenerSuite:
>>- ...
>>- End-to-end replay *** FAILED ***
>>  "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
>>- End-to-end replay with compression *** FAILED ***
>>  "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
>>
>> we tried to revert the commit and then the tests passed, the
>> patch is too big and so sorry we can't find the reason till now, if you are
>> interesting please try it, and it will be very appreciate  if
>> someone can help us to figure it out.
>>
>> Second, about the test time, we increased the flavor of arm instance to
>> 16U16G, but seems there was no significant improvement, the k8s integration
>> test took about one and a half hours, and the QA test(like
>> spark-master-test-maven-hadoop-2.7 community jenkins job) took about
>> seventeen hours(it is too long :(), we suspect that the reason is the
>> performance and network,
>> we split the jobs based on projects such as sql, core and so on, the time
>> can be decrease to about seven hours, see
>> https://github.com/theopenlab/spark/pull/19 We found the Spark QA tests
>> like  https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/   ,
>> it looks all tests seem never download the jar packages from maven centry
>> repo(such as
>> https://repo.maven.apache.org/maven2/org/opencypher/okapi-api/0.4.2/okapi-api-0.4.2.jar).
>> So we want to know how the jenkins jobs can do that, is there a internal
>> maven repo launched? maybe we can do the same thing to avoid the network
>> connection cost during downloading the dependent jar packages.
>>
>> Third, the most important thing, it's about ARM CI of spark, we believe
>> that it is necessary, right? And you c

Re: Ask for ARM CI for spark

2019-08-15 Thread Tianhua huang
Hi all,

I want to discuss Spark ARM CI again. We ran some tests on an ARM instance
based on master; the jobs include
https://github.com/theopenlab/spark/pull/13 and the k8s integration tests
https://github.com/theopenlab/spark/pull/17/. There are several things I
want to talk about:

First, about the failed tests:
1. We have fixed some problems, like
https://github.com/apache/spark/pull/25186 and
https://github.com/apache/spark/pull/25279; thanks to Sean Owen and others
for helping us.
2. We tried the k8s integration test on ARM and hit an error: apk fetch
hangs. The tests passed after adding the '--network host' option to the
`docker build` command, see:

https://github.com/theopenlab/spark/pull/17/files#diff-5b731b14068240d63a93c393f6f9b1e8R176
; the solution refers to
https://github.com/gliderlabs/docker-alpine/issues/307. We don't know
whether this has ever happened in the community CI; maybe we should submit a
PR to pass '--network host' to `docker build`?
3. We found two tests failing after the commit
https://github.com/apache/spark/pull/23767 :
   ReplayListenerSuite:
   - ...
   - End-to-end replay *** FAILED ***
 "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
   - End-to-end replay with compression *** FAILED ***
 "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)

We tried reverting the commit and the tests then passed. The patch is quite
big and unfortunately we can't find the root cause yet; if you are
interested please try it, and it would be much appreciated if someone could
help us figure it out.
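The '--network host' workaround from item 2 above, as a sketch; the image tag and Dockerfile path here are placeholders, not necessarily what the linked PR uses:

```shell
# Use host networking during the image build so 'apk fetch' inside the
# Alpine-based build does not hang on the ARM instance.
docker build --network host \
    -t spark-k8s-test:latest \
    -f Dockerfile .
```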

Second, about the test time: we increased the flavor of the ARM instance to
16U16G, but there was no significant improvement. The k8s integration test
took about one and a half hours, and the QA test (like the
spark-master-test-maven-hadoop-2.7 community Jenkins job) took about
seventeen hours (too long :(). We suspect the reasons are the instance's
performance and the network.
If we split the jobs by project, such as sql, core and so on, the time can
be decreased to about seven hours, see
https://github.com/theopenlab/spark/pull/19. We found that the Spark QA tests
like https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/ never
seem to download jar packages from the Maven Central repo (such as
https://repo.maven.apache.org/maven2/org/opencypher/okapi-api/0.4.2/okapi-api-0.4.2.jar).
We want to know how the Jenkins jobs do that: is there an internal Maven repo?
Maybe we can do the same thing to avoid the network cost of downloading the
dependency jars.
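As Sean explains elsewhere in the thread, the answer is Maven's local cache: after one full build, everything lives under ~/.m2/repository, so later builds on the same worker skip the downloads. A quick way to see this, assuming a prior successful build on the machine:

```shell
# After one full build, dependencies are cached locally:
ls ~/.m2/repository

# Subsequent builds can even run offline against that cache:
./build/mvn --offline -DskipTests package
```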

Third, the most important thing: ARM CI for Spark. We believe it is
necessary, right? And you can see we have really made a lot of effort. Now
that the basic ARM build/test jobs are OK, we suggest adding ARM jobs to the
community CI; we can set them to non-voting at first and improve/enrich the
jobs step by step. Generally, there are two ways in our mind to integrate
ARM CI for Spark:
 1) We introduce the OpenLab ARM CI into Spark as a custom CI system. We
provide human resources and ARM test VMs, and we will focus on ARM-related
Spark issues and push the PRs to the community.
 2) We donate ARM VM resources to the existing AMPLab Jenkins. We still
provide human resources, focus on ARM-related Spark issues, and push the
PRs to the community.
With either option we will provide human resources for maintenance, and of
course it would be great if we can work together. So please tell us which
option you would like, and let's move forward. Waiting for your reply,
thank you very much.

On Wed, Aug 14, 2019 at 10:30 AM Tianhua huang 
wrote:

> OK, thanks.
>
> On Tue, Aug 13, 2019 at 8:37 PM Sean Owen  wrote:
>
>> -dev@ -- it's better not to send to the whole list to discuss specific
>> changes or issues from here. You can reply on the pull request.
>> I don't know what the issue is either at a glance.
>>
>> On Tue, Aug 13, 2019 at 2:54 AM Tianhua huang 
>> wrote:
>>
>>> Hi all,
>>>
>>> About the arm test of spark, recently we found two tests failed after
>>> the commit https://github.com/apache/spark/pull/23767:
>>>ReplayListenerSuite:
>>>- ...
>>>- End-to-end replay *** FAILED ***
>>>  "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
>>>- End-to-end replay with compression *** FAILED ***
>>>  "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
>>>
>>> We tried to revert the commit and then the tests passed, the patch is
>>> too big and so sorry we can't find the reason till now, if you are
>>> interesting please try it, and it will be very appreciate  if
>>> someone can help us to figure i

Re: Ask for ARM CI for spark

2019-08-13 Thread Tianhua huang
Hi all,

About the arm test of spark, recently we found two tests failed after the
commit https://github.com/apache/spark/pull/23767:
   ReplayListenerSuite:
   - ...
   - End-to-end replay *** FAILED ***
 "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
   - End-to-end replay with compression *** FAILED ***
 "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)

We tried reverting the commit and the tests passed. The patch is quite
large, so unfortunately we haven't found the root cause yet. If you are
interested, please try it; it would be much appreciated if someone could
help us figure it out.

On Tue, Aug 6, 2019 at 9:08 AM bo zhaobo 
wrote:

> Hi shane,
> Thanks for your reply. I will wait for you back. ;-)
>
> Thanks,
> Best regards
> ZhaoBo
>
>
>
>
> shane knapp  于2019年8月2日周五 下午10:41写道:
>
>> i'm out of town, but will answer some of your questions next week.
>>
>> On Fri, Aug 2, 2019 at 2:39 AM bo zhaobo 
>> wrote:
>>
>>>
>>> Hi Team,
>>>
>>> Any updates about the CI details? ;-)
>>>
>>> Also, I will also need your kind help about Spark QA test, could any one
>>> can tell us how to trigger that tests? When? How?  So far, I haven't
>>> notices how it works.
>>>
>>> Thanks
>>>
>>> Best Regards,
>>>
>>> ZhaoBo
>>>
>>>
>>>
>>>
>>> bo zhaobo  于2019年7月31日周三 上午11:56写道:
>>>
>>>> Hi, team.
>>>> I want to make the same test on ARM like existing CI does(x86). As
>>>> building and testing the whole spark projects will cost too long time, so I
>>>> plan to split them to multiple jobs to run for lower time cost. But I
>>>> cannot see what the existing CI[1] have done(so many private scripts
>>>> called), so could any CI maintainers help/tell us for how to split them and
>>>> the details about different CI jobs does? Such as PR title contains [SQL],
>>>> [INFRA], [ML], [DOC], [CORE], [PYTHON], [k8s], [DSTREAMS], [MLlib],
>>>> [SCHEDULER], [SS],[YARN], [BUIILD] and etc..I found each of them seems run
>>>> the different CI job.
>>>>
>>>> @shane knapp,
>>>> Oh, sorry for disturb. I found your email looks like from 'berkeley.edu',
>>>> are you the good guy who we are looking for help about this? ;-)
>>>> If so, could you give some helps or advices? Thank you.
>>>>
>>>> Thank you very much,
>>>>
>>>> Best Regards,
>>>>
>>>> ZhaoBo
>>>>
>>>> [1] https://amplab.cs.berkeley.edu/jenkins
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Tianhua huang  于2019年7月29日周一 上午9:38写道:
>>>>
>>>>> @Sean Owen   Thank you very much. And I saw your
>>>>> reply comment in https://issues.apache.org/jira/browse/SPARK-28519, I
>>>>> will test with modification and to see whether there are other similar
>>>>> tests fail, and will address them together in one pull request.
>>>>>
>>>>> On Sat, Jul 27, 2019 at 9:04 PM Sean Owen  wrote:
>>>>>
>>>>>> Great thanks - we can take this to JIRAs now.
>>>>>> I think it's worth changing the implementation of atanh if the test
>>>>>> value just reflects what Spark does, and there's evidence is a little bit
>>>>>> inaccurate.
>>>>>> There's an equivalent formula which seems to have better accuracy.

Re: Ask for ARM CI for spark

2019-07-28 Thread Tianhua huang
@Sean Owen   Thank you very much. I saw your reply comment in
https://issues.apache.org/jira/browse/SPARK-28519; I will test with the
modification to see whether other similar tests fail, and will address them
together in one pull request.

On Sat, Jul 27, 2019 at 9:04 PM Sean Owen  wrote:

> Great thanks - we can take this to JIRAs now.
> I think it's worth changing the implementation of atanh if the test value
> just reflects what Spark does, and there's evidence is a little bit
> inaccurate.
> There's an equivalent formula which seems to have better accuracy.
>
> On Fri, Jul 26, 2019 at 10:02 PM Takeshi Yamamuro 
> wrote:
>
>> Hi, all,
>>
>> FYI:
>> >> @Yuming Wang the results in float8.sql are from PostgreSQL directly?
>> >> Interesting if it also returns the same less accurate result, which
>> >> might suggest it's more to do with underlying OS math libraries. You
>> >> noted that these tests sometimes gave platform-dependent differences
>> >> in the last digit, so wondering if the test value directly reflects
>> >> PostgreSQL or just what we happen to return now.
>>
>> The results in float8.sql.out were recomputed in Spark/JVM.
>> The expected output of the PostgreSQL test is here:
>> https://github.com/postgres/postgres/blob/master/src/test/regress/expected/float8.out#L493
>>
>> As you can see in the file (float8.out), the results other than atanh
>> also are different between Spark/JVM and PostgreSQL.
>> For example, the answers of acosh are:
>> -- PostgreSQL
>>
>> https://github.com/postgres/postgres/blob/master/src/test/regress/expected/float8.out#L487
>> 1.31695789692482
>>
>> -- Spark/JVM
>>
>> https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/results/pgSQL/float8.sql.out#L523
>> 1.3169578969248166
>>
>> btw, the PostgreSQL implementation for atanh just calls atanh in math.h:
>>
>> https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/float.c#L2606
>>
>> Bests,
>> Takeshi
>>
>>


Re: Ask for ARM CI for spark

2019-07-26 Thread Tianhua huang
Hi, all


Sorry to disturb again: several SQL tests failed on the arm64
instance:

   - pgSQL/float8.sql *** FAILED ***
   Expected "0.549306144334054[9]", but got "0.549306144334054[8]" Result
   did not match for query #56
   SELECT atanh(double('0.5')) (SQLQueryTestSuite.scala:362)
   - pgSQL/numeric.sql *** FAILED ***
   Expected "2 2247902679199174[72 224790267919917955.1326161858
   4 7405685069595001 7405685069594999.0773399947
   5 5068226527.321263 5068226527.3212726541
   6 281839893606.99365 281839893606.9937234336
   7 1716699575118595840 1716699575118597095.4233081991
   8 167361463828.0749 167361463828.0749132007
   9 107511333880051856] 107511333880052007", but got "2
   2247902679199174[40224790267919917955.1326161858
   4 7405685069595001 7405685069594999.0773399947
   5 5068226527.321263 5068226527.3212726541
   6 281839893606.99365 281839893606.9937234336
   7 1716699575118595580 1716699575118597095.4233081991
   8 167361463828.0749 167361463828.0749132007
   9 107511333880051872] 107511333880052007" Result did not match for
   query #496
   SELECT t1.id1, t1.result, t2.expected
   FROM num_result t1, num_exp_power_10_ln t2
   WHERE t1.id1 = t2.id
   AND t1.result != t2.expected (SQLQueryTestSuite.scala:362)

The first test failed because the value of math.log(3.0) is different on
aarch64:

# on x86_64:
scala> val a = 0.5
a: Double = 0.5

scala> a * math.log((1.0 + a) / (1.0 - a))
res1: Double = 0.5493061443340549

scala> math.log((1.0 + a) / (1.0 - a))
res2: Double = 1.0986122886681098

# on aarch64:
scala> val a = 0.5
a: Double = 0.5

scala> a * math.log((1.0 + a) / (1.0 - a))
res20: Double = 0.5493061443340548

scala> math.log((1.0 + a) / (1.0 - a))
res21: Double = 1.0986122886681096

I tried several other values such as math.log(4.0) and math.log(5.0), and
they are the same; I don't know why math.log(3.0) is special, but its
result is indeed different on aarch64. If you are interested, please try it.
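As a side note, atanh can also be computed via log1p, keeping the two logarithms separate instead of dividing first. This is only a sketch of the algebraically equivalent form (atanh(x) = 0.5 * (log1p(x) - log1p(-x))), not what Spark currently does:

```scala
import java.lang.Math.log1p

// atanh via the direct formula used in the failing test ...
def atanhDirect(x: Double): Double = 0.5 * math.log((1.0 + x) / (1.0 - x))

// ... and an algebraically equivalent form built on log1p, which can
// round differently in the last bit.
def atanhLog1p(x: Double): Double = 0.5 * (log1p(x) - log1p(-x))

// For x = 0.5 both compute 0.5 * ln(3); any disagreement is confined
// to the last few bits.
println(atanhDirect(0.5))
println(atanhLog1p(0.5))
```

Either form may differ from the other by a unit or two in the last place, which is exactly the kind of platform-dependent difference seen above.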

The second test failed because some values of pow(10, x) are different on
aarch64. Following Spark's SQL tests, I ran similar computations on aarch64
and x86_64; take '-83028485' as an example:

# on x86_64:
scala> import java.lang.Math._
import java.lang.Math._
scala> var a = -83028485
a: Int = -83028485
scala> abs(a)
res4: Int = 83028485
scala> math.log(abs(a))
res5: Double = 18.234694299654787
scala> pow(10, math.log(abs(a)))
res6: Double = 1.71669957511859584E18

# on aarch64:
scala> var a = -83028485
a: Int = -83028485

scala> abs(a)
res38: Int = 83028485

scala> math.log(abs(a))
res39: Double = 18.234694299654787

scala> pow(10, math.log(abs(a)))
res40: Double = 1.71669957511859558E18
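Worth noting: the java.lang.Math.pow javadoc only guarantees the result is within 1 ulp of the exact value, so both platform results above are within spec. A quick sketch measuring the gap between the two observed values:

```scala
// The two values observed above for pow(10, math.log(abs(-83028485))).
val x86Result     = 1.71669957511859584e18
val aarch64Result = 1.71669957511859558e18

// Near 1.7e18 adjacent doubles are 256 apart, so the two results differ
// by one unit in the last place, within Math.pow's documented accuracy.
val gap = math.abs(x86Result - aarch64Result)
println(gap)                  // distance between the two results
println(math.ulp(x86Result))  // spacing of doubles at this magnitude
```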

I sent an email to jdk-dev hoping someone can help, and also filed
https://issues.apache.org/jira/browse/SPARK-28519 in JIRA. If you are
interested, welcome to join the discussion, thank you very much.

On Thu, Jul 18, 2019 at 11:12 AM Tianhua huang 
wrote:

> Thanks for your reply.
>
> About the first problem we didn't find any other reason in log, just found
> timeout to wait the executor up, and after increase the timeout from 1
> ms to 3(even 2)ms,
> https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L764
>
> https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L792
> the test passed, and there are more than one executor up, not sure whether
> it's related with the flavor of our aarch64 instance? Now the flavor of the
> instance is 8C8G. Maybe we will try the bigger flavor later. Or any one has
> other suggestion, please contact me, thank you.
>
> About the second problem, I proposed a pull request to apache/spark,
> https://github.com/apache/spark/pull/25186  if you have time, would you
> please to help to review it, thank you very much.
>
> On Wed, Jul 17, 2019 at 8:37 PM Sean Owen  wrote:
>
>> On Wed, Jul 17, 2019 at 6:28 AM Tianhua huang 
>> wrote:
>> > Two failed and the reason is 'Can't find 1 executors before 1
>> milliseconds elapsed', see below, then we try increase timeout the tests
>> passed, so wonder if we can increase the timeout? and here I have another
>> question about
>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TestUtils.scala#L285,
>> why is not >=? see the comment of the function, it should be >=?
>> >
>>
>> I think it's ">" because the driver is also an executor, but not 100%
>> sure. In any event it passes in general.
>> These errors typically mean "I didn't start successfully" for some
>> other reason that may be in the logs.
>>
>> > The other two failed and the reason is '2143289344 equaled 2143289344',
>> thi

Re: Ask for ARM CI for spark

2019-07-17 Thread Tianhua huang
Thanks for your reply.

About the first problem: we didn't find any other reason in the logs, just a
timeout waiting for the executor to come up. After increasing the timeout
from 1 ms to 3 (even 2) ms in
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L764
and
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L792
the tests passed, and more than one executor came up. We are not sure
whether it's related to the flavor of our aarch64 instance; the current
flavor is 8C8G. Maybe we will try a bigger flavor later. If anyone has other
suggestions, please contact me, thank you.

About the second problem, I proposed a pull request to apache/spark,
https://github.com/apache/spark/pull/25186 . If you have time, would you
please help to review it? Thank you very much.

On Wed, Jul 17, 2019 at 8:37 PM Sean Owen  wrote:

> On Wed, Jul 17, 2019 at 6:28 AM Tianhua huang 
> wrote:
> > Two failed and the reason is 'Can't find 1 executors before 1
> milliseconds elapsed', see below, then we try increase timeout the tests
> passed, so wonder if we can increase the timeout? and here I have another
> question about
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TestUtils.scala#L285,
> why is not >=? see the comment of the function, it should be >=?
> >
>
> I think it's ">" because the driver is also an executor, but not 100%
> sure. In any event it passes in general.
> These errors typically mean "I didn't start successfully" for some
> other reason that may be in the logs.
>
> > The other two failed and the reason is '2143289344 equaled 2143289344',
> this because the value of floatToRawIntBits(0.0f/0.0f) on aarch64 platform
> is 2143289344 and equals to floatToRawIntBits(Float.NaN). About this I send
> email to jdk-dev and proposed a topic on scala community
> https://users.scala-lang.org/t/the-value-of-floattorawintbits-0-0f-0-0f-is-different-on-x86-64-and-aarch64-platforms/4845
> and https://github.com/scala/bug/issues/11632, I thought it's something
> about jdk or scala, but after discuss, it should related with platform, so
> seems the following asserts is not appropriate?
> https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala#L704-L705
> and
> https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala#L732-L733
>
> These tests could special-case execution on ARM, like you'll see some
> tests handle big-endian architectures.
>


Re: Ask for ARM CI for spark

2019-07-17 Thread Tianhua huang
Hi all,

We ran all unit tests for Spark on the arm64 platform; after some effort,
four tests FAILED, see
https://logs.openlabtesting.org/logs/4/4/ae5ebaddd6ba6eba5a525b2bf757043ebbe78432/check/spark-build-arm64/9ecccad/job-output.txt.gz

Two failed with 'Can't find 1 executors before 1
milliseconds elapsed' (see below); after we increased the timeout, the tests
passed, so we wonder if we can increase the timeout. And here I have another
question about
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TestUtils.scala#L285:
why is it not >=? According to the comment of the function, it should be >=.

- test driver discovery under local-cluster mode *** FAILED ***
  java.util.concurrent.TimeoutException: Can't find 1 executors before
1 milliseconds elapsed
  at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293)
  at 
org.apache.spark.SparkContextSuite.$anonfun$new$78(SparkContextSuite.scala:753)
  at 
org.apache.spark.SparkContextSuite.$anonfun$new$78$adapted(SparkContextSuite.scala:741)
  at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161)
  at 
org.apache.spark.SparkContextSuite.$anonfun$new$77(SparkContextSuite.scala:741)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)

- test gpu driver resource files and discovery under local-cluster
mode *** FAILED ***
  java.util.concurrent.TimeoutException: Can't find 1 executors before
1 milliseconds elapsed
  at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293)
  at 
org.apache.spark.SparkContextSuite.$anonfun$new$80(SparkContextSuite.scala:781)
  at 
org.apache.spark.SparkContextSuite.$anonfun$new$80$adapted(SparkContextSuite.scala:761)
  at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161)
  at 
org.apache.spark.SparkContextSuite.$anonfun$new$79(SparkContextSuite.scala:761)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)

The other two failed with '2143289344 equaled 2143289344'. This is
because the value of floatToRawIntBits(0.0f/0.0f) on the aarch64
platform is 2143289344, which equals floatToRawIntBits(Float.NaN).
About this I sent an email to jdk-dev and opened topics in the Scala
community:
https://users.scala-lang.org/t/the-value-of-floattorawintbits-0-0f-0-0f-is-different-on-x86-64-and-aarch64-platforms/4845
and https://github.com/scala/bug/issues/11632. I thought it was
something about the JDK or Scala, but after discussion it appears to be
platform-dependent, so the following asserts seem inappropriate:
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala#L704-L705
and 
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala#L732-L733

 - SPARK-26021: NaN and -0.0 in grouping expressions *** FAILED ***
   2143289344 equaled 2143289344 (DataFrameAggregateSuite.scala:732)
 - NaN and -0.0 in window partition keys *** FAILED ***
   2143289344 equaled 2143289344 (DataFrameWindowFunctionsSuite.scala:704)
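One possible direction (an assumption on our side, not something the tests do today): Float.floatToIntBits collapses every NaN to the canonical bit pattern 0x7fc00000, while floatToRawIntBits preserves the hardware-produced bits, so comparisons built on the former would be platform-independent:

```scala
import java.lang.Float.{floatToIntBits, floatToRawIntBits}

// A hardware-generated NaN; its raw bit pattern can differ by platform.
val nan = 0.0f / 0.0f

// floatToRawIntBits returns the exact bits: 0x7fc00000 (2143289344) on
// aarch64, but a different NaN pattern on some x86_64 setups.
println(floatToRawIntBits(nan))

// floatToIntBits canonicalizes every NaN to 0x7fc00000, so this value
// is the same on every platform.
println(floatToIntBits(nan))
```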

About fixing the failed tests, we are waiting for your suggestions,
thank you very much.


On Wed, Jul 10, 2019 at 10:07 AM Tianhua huang 
wrote:

> Hi all,
>
> I am glad to tell you there is a new progress of build/test spark on
> aarch64 server, the tests are running, see the build/test detail log
> https://logs.openlabtesting.org/logs/1/1/419fcb11764048d5a3cda186ea76dd43249e1f97/check/spark-build-arm64/75cc6f5/job-output.txt.gz
>  and
> the aarch64 instance info see
> https://logs.openlabtesting.org/logs/1/1/419fcb11764048d5a3cda186ea76dd43249e1f97/check/spark-build-arm64/75cc6f5/zuul-info/zuul-info.ubuntu-xenial-arm64.txt
>  In
> order to enable the test, I made some modification, the major one is to
> build leveldbjni local package, I forked fusesource/leveldbjni and
> chirino/leveldb repos, and made some modification to make sure to build the
> local package, see https://github.com/huangtianhua/leveldbjni/pull/1 and
> https://github.com/huangtianhua/leveldbjni/pull/2 , then to use it in
> spark, the detail you can find in
> https://github.com/theopenlab/spark/pull/1
>
> Now the tests are not all successful, I will try to fix it and any
> suggestion is welcome, thank you all.
>
> On Mon, Jul 1, 2019 at 5:25 PM Tianhua huang 
> wrote:
>
>> We are focus on the arm instance of cloud, and now I use

Re: Ask for ARM CI for spark

2019-07-09 Thread Tianhua huang
Hi all,

I am glad to tell you there is new progress on building/testing Spark on an
aarch64 server: the tests are running. See the detailed build/test log at
https://logs.openlabtesting.org/logs/1/1/419fcb11764048d5a3cda186ea76dd43249e1f97/check/spark-build-arm64/75cc6f5/job-output.txt.gz
and the aarch64 instance info at
https://logs.openlabtesting.org/logs/1/1/419fcb11764048d5a3cda186ea76dd43249e1f97/check/spark-build-arm64/75cc6f5/zuul-info/zuul-info.ubuntu-xenial-arm64.txt
In order to enable the tests, I made some modifications; the major one was
building a local leveldbjni package. I forked the fusesource/leveldbjni and
chirino/leveldb repos and modified them so the local package builds, see
https://github.com/huangtianhua/leveldbjni/pull/1 and
https://github.com/huangtianhua/leveldbjni/pull/2 , and then used it in
Spark; the details are in https://github.com/theopenlab/spark/pull/1


Not all tests pass yet; I will try to fix them, and any suggestions are
welcome. Thank you all.

On Mon, Jul 1, 2019 at 5:25 PM Tianhua huang 
wrote:

> We are focus on the arm instance of cloud, and now I use the arm instance
> of vexxhost cloud to run the build job which mentioned above, the
> specification of the arm instance is 8VCPU and 8GB of RAM,
> and we can use bigger flavor to create the arm instance to run the job, if
> need be.
>
> On Fri, Jun 28, 2019 at 6:55 PM Steve Loughran 
> wrote:
>
>>
>> Be interesting to see how well a Pi4 works; with only 4GB of RAM you
>> wouldn't compile with it, but you could try installing the spark jar bundle
>> and then run against some NFS mounted disks:
>> https://www.raspberrypi.org/magpi/raspberry-pi-4-specs-benchmarks/ ;
>> unlikely to be fast, but it'd be an efficient kind of slow
>>
>> On Fri, Jun 28, 2019 at 3:08 AM Rui Chen  wrote:
>>
>>> >  I think any AA64 work is going to have to define very clearly what
>>> "works" is defined as
>>>
>>> +1
>>> It's very valuable to build a clear scope of these projects
>>> functionality for ARM platform in upstream community, it bring confidence
>>> to end user and customers when they plan to deploy these projects on ARM.
>>>
>>> This is absolute long term work, let's to make it step by step, CI,
>>> testing, issue and resolving.
>>>
>>> Steve Loughran  于2019年6月27日周四 下午9:22写道:
>>>
>>>> level db and native codecs are invariably a problem here, as is
>>>> anything else doing misaligned IO. Protobuf has also had "issues" in the
>>>> past
>>>>
>>>> see https://issues.apache.org/jira/browse/HADOOP-16100
>>>>
>>>> I think any AA64 work is going to have to define very clearly what
>>>> "works" is defined as; spark standalone with a specific set of codecs is
>>>> probably the first thing to aim for -no Snappy or lz4.
>>>>
>>>> Anything which goes near: protobuf, checksums, native code, etc is in
>>>> trouble. Don't try and deploy with HDFS as the cluster FS, would be my
>>>> recommendation.
>>>>
>>>> If you want a cluster use NFS or one of google GCS, Azure WASB for the
>>>> cluster FS. And before trying either of those cloud store, run the
>>>> filesystem connector test suites (hadoop-azure; google gcs github) to see
>>>> that they work. If the foundational FS test suites fail, nothing else will
>>>> work
>>>>
>>>>
>>>>
>>>> On Thu, Jun 27, 2019 at 3:09 AM Tianhua huang <
>>>> huangtianhua...@gmail.com> wrote:
>>>>
>>>>> I took the ut tests on my arm instance before and reported an issue in
>>>>> https://issues.apache.org/jira/browse/SPARK-27721,  and seems there
>>>>> was no leveldbjni native package for aarch64 in leveldbjni-all.jar(or 1.8)
>>>>> https://mvnrepository.com/artifact/org.fusesource.leveldbjni/leveldbjni-all/1.8
>>>>> , we can find https://github.com/fusesource/leveldbjni/pull/82 this
>>>>> pr added the aarch64 support and merged on 2 Nov 2017, but the latest
>>>>> release of the repo is  on 17 Oct 2013, unfortunately it didn't
>>>>> include the aarch64 supporting.
>>>>>
>>>>> I will running the test on the job mentioned above, and will try to
>>>>> fix the issue above, or if anyone have any idea of it, welcome reply me,
>>>>> thank you.
>>>>>
>>>>>
>>>>> On Wed, Jun 26, 2019 at 8:11 PM Sean Owen  wrote:
>>>>>
>>>>>> Can yo

Re: Ask for ARM CI for spark

2019-07-01 Thread Tianhua huang
We are focusing on cloud ARM instances, and currently I use an ARM instance
in the vexxhost cloud to run the build job mentioned above. The
specification of the instance is 8 vCPU and 8 GB of RAM,
and we can create a bigger flavor to run the job, if
need be.

On Fri, Jun 28, 2019 at 6:55 PM Steve Loughran 
wrote:

>
> Be interesting to see how well a Pi4 works; with only 4GB of RAM you
> wouldn't compile with it, but you could try installing the spark jar bundle
> and then run against some NFS mounted disks:
> https://www.raspberrypi.org/magpi/raspberry-pi-4-specs-benchmarks/ ;
> unlikely to be fast, but it'd be an efficient kind of slow
>
> On Fri, Jun 28, 2019 at 3:08 AM Rui Chen  wrote:
>
>> >  I think any AA64 work is going to have to define very clearly what
>> "works" is defined as
>>
>> +1
>> It's very valuable to build a clear scope of these projects functionality
>> for ARM platform in upstream community, it bring confidence to end user and
>> customers when they plan to deploy these projects on ARM.
>>
>> This is absolute long term work, let's to make it step by step, CI,
>> testing, issue and resolving.
>>
>> Steve Loughran  于2019年6月27日周四 下午9:22写道:
>>
>>> level db and native codecs are invariably a problem here, as is anything
>>> else doing misaligned IO. Protobuf has also had "issues" in the past
>>>
>>> see https://issues.apache.org/jira/browse/HADOOP-16100
>>>
>>> I think any AA64 work is going to have to define very clearly what
>>> "works" is defined as; spark standalone with a specific set of codecs is
>>> probably the first thing to aim for -no Snappy or lz4.
>>>
>>> Anything which goes near: protobuf, checksums, native code, etc is in
>>> trouble. Don't try and deploy with HDFS as the cluster FS, would be my
>>> recommendation.
>>>
>>> If you want a cluster use NFS or one of google GCS, Azure WASB for the
>>> cluster FS. And before trying either of those cloud store, run the
>>> filesystem connector test suites (hadoop-azure; google gcs github) to see
>>> that they work. If the foundational FS test suites fail, nothing else will
>>> work
>>>
>>>
>>>
>>> On Thu, Jun 27, 2019 at 3:09 AM Tianhua huang 
>>> wrote:
>>>
>>>> I took the ut tests on my arm instance before and reported an issue in
>>>> https://issues.apache.org/jira/browse/SPARK-27721,  and seems there
>>>> was no leveldbjni native package for aarch64 in leveldbjni-all.jar(or 1.8)
>>>> https://mvnrepository.com/artifact/org.fusesource.leveldbjni/leveldbjni-all/1.8
>>>> , we can find https://github.com/fusesource/leveldbjni/pull/82 this pr
>>>> added the aarch64 support and merged on 2 Nov 2017, but the latest release
>>>> of the repo is  on 17 Oct 2013, unfortunately it didn't include the
>>>> aarch64 supporting.
>>>>
>>>> I will running the test on the job mentioned above, and will try to fix
>>>> the issue above, or if anyone have any idea of it, welcome reply me, thank
>>>> you.
>>>>
>>>>
>>>> On Wed, Jun 26, 2019 at 8:11 PM Sean Owen  wrote:
>>>>
>>>>> Can you begin by testing yourself? I think the first step is to make
>>>>> sure the build and tests work on ARM. If you find problems you can
>>>>> isolate them and try to fix them, or at least report them. It's only
>>>>> worth getting CI in place when we think builds will work.
>>>>>
>>>>> On Tue, Jun 25, 2019 at 9:26 PM Tianhua huang <
>>>>> huangtianhua...@gmail.com> wrote:
>>>>> >
>>>>> > Thanks Shane :)
>>>>> >
>>>>> > This sounds good, and yes I agree that it's best to keep the
>>>>> test/build infrastructure in one place. If you can't find the ARM resource
>>>>> we are willing to support the ARM instance :)  Our goal is to make more
>>>>> open source software to be more compatible for aarch64 platform, so let's
>>>>> to do it. I will be happy if I can give some help for the goal.
>>>>> >
>>>>> > Waiting for you good news :)
>>>>> >
>>>>> > On Wed, Jun 26, 2019 at 9:47 AM shane knapp 
>>>>> wrote:
>>>>> >>
>>>>> >> ...or via VM as you mentioned earlier.  :)
>>>>> >>
>>>>> >> shane (who will file a JIRA t

Re: Ask for ARM CI for spark

2019-06-26 Thread Tianhua huang
I ran the unit tests on my ARM instance before and reported an issue in
https://issues.apache.org/jira/browse/SPARK-27721; it seems there is no
leveldbjni native package for aarch64 in leveldbjni-all-1.8.jar (
https://mvnrepository.com/artifact/org.fusesource.leveldbjni/leveldbjni-all/1.8
). The pull request https://github.com/fusesource/leveldbjni/pull/82 added
aarch64 support and was merged on 2 Nov 2017, but the latest release of the
repo is from 17 Oct 2013, so unfortunately it doesn't include aarch64
support.

I will run the tests in the job mentioned above and try to fix this issue;
if anyone has any ideas about it, feel free to reply, thank you.


On Wed, Jun 26, 2019 at 8:11 PM Sean Owen  wrote:

> Can you begin by testing yourself? I think the first step is to make
> sure the build and tests work on ARM. If you find problems you can
> isolate them and try to fix them, or at least report them. It's only
> worth getting CI in place when we think builds will work.
>
> On Tue, Jun 25, 2019 at 9:26 PM Tianhua huang 
> wrote:
> >
> > Thanks Shane :)
> >
> > This sounds good, and yes I agree that it's best to keep the test/build

Re: Ask for ARM CI for spark

2019-06-25 Thread Tianhua huang
Thanks Shane :)

This sounds good, and yes, I agree that it's best to keep the test/build
infrastructure in one place. If you can't find an ARM resource, we are
willing to provide the ARM instance :)  Our goal is to make more open
source software compatible with the aarch64 platform, so let's do it. I
will be happy to help toward that goal.

Waiting for your good news :)

On Wed, Jun 26, 2019 at 9:47 AM shane knapp  wrote:

> ...or via VM as you mentioned earlier.  :)
>
> shane (who will file a JIRA tomorrow)
>
> On Tue, Jun 25, 2019 at 6:44 PM shane knapp  wrote:
>
>> i'd much prefer that we keep the test/build infrastructure in one place.
>>
>> we don't have ARM hardware, but there's a slim possibility i can scare
>> something up in our older research stock...
>>
>> another option would be to run the build in an arm-based docker container,
>> which (according to the intarwebs) is possible.
>>
>> shane
>>

Re: Ask for ARM CI for spark

2019-06-25 Thread Tianhua huang
I forked the apache/spark project and proposed a job
(https://github.com/theopenlab/spark/pull/1) for building Spark on an
OpenLab ARM instance. This is the first step toward building Spark on ARM;
I can enable a periodic ARM build job for apache/spark master if you like.
Later I will run the tests for Spark. I am also willing to be the
maintainer of the ARM CI for Spark.

Thanks for your attention.



Re: Ask for ARM CI for spark

2019-06-19 Thread Tianhua huang
Thanks Sean.

I am very happy to hear that the community will put effort into fixing the
ARM-related issues. I'd be happy to help if you like. Could you give me the
tracking link for this issue, so that I can check whether it is fixed?
Thank you.
As far as I know, older versions of Spark supported ARM and the newer
versions don't, which shows that we need a CI to check whether Spark
supports ARM and whether some modification breaks it.
I will add a demo job in OpenLab to build Spark on ARM and run a simple UT
test. Later I will share the job link.

Let me know what you think.

Thank you all!


On Wed, Jun 19, 2019 at 8:47 PM Sean Owen  wrote:

> I'd begin by reporting and fixing ARM-related issues in the build. If
> they're small, of course we should do them. If it requires significant
> modifications, we can discuss how much Spark can support ARM. I don't
> think it's yet necessary for the Spark project to run these CI builds
> until that point, but it's always welcome if people are testing that
> separately.
>
> On Wed, Jun 19, 2019 at 7:41 AM Holden Karau  wrote:
> >
> > Moving to dev@ for increased visibility among the developers.
> >
> > On Wed, Jun 19, 2019 at 1:24 AM Tianhua huang 
> wrote:
> >>
> >> Thanks for your reply.
> >>
> >> As I said before, I ran into some build and test problems with Spark on
> an aarch64 server, so it would be better to have ARM CI to make sure that
> Spark stays compatible with AArch64 platforms.
> >>
> >> I'm from the OpenLab team (https://openlabtesting.org/), a community
> that does open source project testing. We can contribute some Arm virtual
> machines to AMPLab Jenkins, and we also have a developer team willing to
> work on this; we are willing to maintain the CI build jobs and address the
> CI issues. What do you think?
> >>
> >>
> >> Thanks for your attention.
> >>
> >>
> >> On Wed, Jun 19, 2019 at 6:39 AM shane knapp 
> wrote:
> >>>
> >>> yeah, we don't have any aarch64 systems for testing...  this has been
> asked before but is currently pretty low on our priority list as we don't
> have the hardware.
> >>>
> >>> sorry,
> >>>
> >>> shane
> >>>
> >>> On Mon, Jun 10, 2019 at 7:08 PM Tianhua huang <
> huangtianhua...@gmail.com> wrote:
> >>>>
> >>>> Hi, sorry to disturb you.
> >>>> CI testing for Apache Spark is supported by AMPLab Jenkins, and I
> find there are some machines (most of them Linux amd64) available for CI,
> but there seems to be no AArch64 machine for Spark CI testing. Recently I
> built and ran the tests for Spark (master and branch-2.4) on my ARM
> server, and unfortunately there are some problems; for example, a unit
> test fails due to the LEVELDBJNI native package. For details, see
> http://paste.openstack.org/show/752063/ (java tests) and
> http://paste.openstack.org/show/752709/ (python tests).
> >>>> So I have a question about ARM CI testing for Spark: is there any
> plan to support it? Thank you very much, and I look forward to your reply!
> >>>
> >>>
> >>>
> >>> --
> >>> Shane Knapp
> >>> UC Berkeley EECS Research / RISELab Staff Technical Lead
> >>> https://rise.cs.berkeley.edu
> >
> >
> >
> > --
> > Twitter: https://twitter.com/holdenkarau
> > Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
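
The LEVELDBJNI failure reported in that original message has a workaround
described later in this thread: install an aarch64-capable
`leveldbjni-all` jar into the local Maven repository under the
`org.fusesource.leveldbjni` coordinates that Spark's build expects. A
minimal sketch, using the artifact coordinates from the thread and printed
as a dry run so it can be inspected before executing:

```shell
# Workaround for the leveldbjni native-package test failure on aarch64,
# using the org.openlabtesting artifact coordinates from this thread.
# Printed as a dry run; remove the echoes to actually execute.
JAR_URL="https://repo1.maven.org/maven2/org/openlabtesting/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar"
JAR="leveldbjni-all-1.8.jar"

echo "wget $JAR_URL"
echo "mvn install:install-file -DgroupId=org.fusesource.leveldbjni -DartifactId=leveldbjni-all -Dversion=1.8 -Dpackaging=jar -Dfile=$JAR"
```

After the install, the Spark build resolves `leveldbjni-all` from the local
repository, so the tests no longer fail loading the native library on ARM64.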