[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Saad Ur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367775#comment-17367775
 ] 

Saad Ur Rahman commented on MESOS-10224:


[~cf.natali], I am finally getting around to patching the issue here.

My understanding of the routine is that it parses the dynamic linker cache 
({{ld.so.cache}}) to generate a vector of library names and paths. It does this 
by casting memory blocks into structs to give them parsable structure.

The conditional on line _#227_ fails because of the trailing data Ubuntu 
leaves at the end of the file: the check expects the data pointer to point 
exactly at the end of the file once parsing is complete. The following 
conditional on line _#235_ ensures NUL termination.
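
To make the "casting memory blocks into structs" step concrete, here is a minimal sketch; it is not the actual Mesos code. The {{Record}} struct and the {{readAt}} and {{terminated}} helpers are hypothetical names chosen for illustration. The sketch copies with {{std::memcpy}} instead of casting, which sidesteps alignment concerns, and it checks bounds before every read:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical fixed-size record, standing in for the real cache structs.
struct Record {
  std::uint32_t key;    // Offset of a NUL-terminated name in the buffer.
  std::uint32_t value;  // Offset of a NUL-terminated path in the buffer.
};

// Copies a T out of `buffer` at `offset`, or returns nullopt if the read
// would run past the end of the buffer.
template <typename T>
std::optional<T> readAt(const std::string& buffer, std::size_t offset) {
  if (offset > buffer.size() || buffer.size() - offset < sizeof(T)) {
    return std::nullopt;
  }
  T out;
  std::memcpy(&out, buffer.data() + offset, sizeof(T));
  return out;
}

// Returns true if a NUL byte exists at or after `offset`, i.e. a C string
// starting there terminates inside the buffer.
bool terminated(const std::string& buffer, std::size_t offset) {
  return offset < buffer.size() &&
         std::memchr(buffer.data() + offset, '\0',
                     buffer.size() - offset) != nullptr;
}
```

A parser built this way can refuse to follow a string offset unless {{terminated}} holds for it, which is the role I understand the check on line _#235_ to play.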

The solutions I can think of are to adjust for the end of data specifically for 
Ubuntu (if we can) by setting:
{code:java}
data = buffer->data() + buffer->size();
{code}
Or we can apply that adjustment only if the data pointer is not at the end:
{code:java}
if ((size_t)(data - buffer->data()) < buffer->size())
{code}
Another thing we can do is let it slide when the data pointer stops short of 
the end of the buffer on line _#227_, and fail only if it runs past it:
{code:java}
if ((size_t)(data - buffer->data()) > buffer->size()) {
  return Error("Invalid format");
}
{code}
What are your thoughts? All of the above are quick adjustments.
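
The difference between an exact-match check and the relaxed one can be sketched as follows. This is an illustration only, assuming the current check on line _#227_ requires an exact match: {{EndCheck}}, {{checkEnd}}, {{acceptsStrict}}, and {{acceptsRelaxed}} are hypothetical names, and {{consumed}} stands for {{(size_t)(data - buffer->data())}}.

```cpp
#include <cstddef>

// Outcome of comparing the parser's final position with the buffer size.
enum class EndCheck { Exact, TrailingData, Overrun };

// `consumed` is how many bytes the parser walked; `size` is the file size.
EndCheck checkEnd(std::size_t consumed, std::size_t size) {
  if (consumed > size) return EndCheck::Overrun;       // Parsed past the end.
  if (consumed < size) return EndCheck::TrailingData;  // Bytes left unparsed.
  return EndCheck::Exact;                              // Ended on the last byte.
}

// Strict policy: same as rejecting whenever consumed != size.
bool acceptsStrict(EndCheck c) { return c == EndCheck::Exact; }

// Relaxed policy: same as rejecting only when consumed > size.
bool acceptsRelaxed(EndCheck c) { return c != EndCheck::Overrun; }
```

Under the relaxed policy a cache with trailing bytes is accepted, while a parse that runs past the end of the buffer is still rejected.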

> [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.
> -------------------------------------------------------------------------
>
> Key: MESOS-10224
> URL: https://issues.apache.org/jira/browse/MESOS-10224
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 1.11.0
> Reporter: Saad Ur Rahman
> Priority: Major
> Attachments: ld.so.cache
>
>
> *OS:* Ubuntu 21.04
> *Command:*
> {code:java}
> make -j 6 V=0 check{code}
> Fails during the build and test suite run on two different machines with the 
> same OS.
> {code:java}
> 3: [       OK ] CSIVersion/StorageLocalResourceProviderTest.Update/v0 (479 ms)
> 3: [----------] 14 tests from CSIVersion/StorageLocalResourceProviderTest (27011 ms total)
> 3: 
> 3: [----------] Global test environment tear-down
> 3: [==========] 575 tests from 178 test cases ran. (202572 ms total)
> 3: [  PASSED  ] 573 tests.
> 3: [  FAILED  ] 2 tests, listed below:
> 3: [  FAILED  ] LdcacheTest.Parse
> 3: [  FAILED  ] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate/v0, where GetParam() = "v0"
> 3: 
> 3:  2 FAILED TESTS
> 3:   YOU HAVE 34 DISABLED TESTS
> 3: 
> 3: 
> 3: 
> 3: [FAIL]: 4 shard(s) have failed tests
> 3/3 Test #3: MesosTests ...***Failed  1173.43 sec
> {code}
> Are there any prerequisites for getting the build/tests to pass? I am 
> trying to get all the tests to pass to make sure my build environment is 
> set up correctly for development.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Saad Ur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367609#comment-17367609
 ] 

Saad Ur Rahman commented on MESOS-10224:


[~cf.natali] thank you! I will get the PR in today. This will give me the 
opportunity to get hands-on with the workflow on Apache projects :)



[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367602#comment-17367602
 ] 

Charles Natali commented on MESOS-10224:


Actually [~surahman] you should go ahead, it's a nice and easy fix!
The problematic code is here: 
https://github.com/apache/mesos/blob/master/src/linux/ldcache.cpp#L227

The code expects that the file ends after the last entry, whereas in your case 
it's not true since there's this description string at the end of the file.




[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367584#comment-17367584
 ] 

Charles Natali commented on MESOS-10224:


Ah, here's the problem, looks like Ubuntu adds some crap at the end of the 
cache.
Let's look at the end of the file.

Mine - Debian - ends with an entry and then the NUL byte:

{noformat}
cf@thinkpad:~/src/mesos$ hexdump -C /etc/ld.so.cache.default | tail
00014d20  36 00 2f 75 73 72 2f 6c  69 62 2f 6c 69 62 42 4c  |6./usr/lib/libBL|
00014d30  54 6c 69 74 65 2e 32 2e  35 2e 73 6f 2e 38 2e 36  |Tlite.2.5.so.8.6|
00014d40  00 6c 69 62 42 4c 54 2e  32 2e 35 2e 73 6f 2e 38  |.libBLT.2.5.so.8|
00014d50  2e 36 00 2f 75 73 72 2f  6c 69 62 2f 6c 69 62 42  |.6./usr/lib/libB|
00014d60  4c 54 2e 32 2e 35 2e 73  6f 2e 38 2e 36 00 6c 64  |LT.2.5.so.8.6.ld|
00014d70  2d 6c 69 6e 75 78 2d 78  38 36 2d 36 34 2e 73 6f  |-linux-x86-64.so|
00014d80  2e 32 00 2f 6c 69 62 2f  78 38 36 5f 36 34 2d 6c  |.2./lib/x86_64-l|
00014d90  69 6e 75 78 2d 67 6e 75  2f 6c 64 2d 6c 69 6e 75  |inux-gnu/ld-linu|
00014da0  78 2d 78 38 36 2d 36 34  2e 73 6f 2e 32 00        |x-x86-64.so.2.|
00014dae
{noformat}


Yours ends with some extra data and strings after the last entry:


{noformat}
cf@thinkpad:~/src/mesos$ hexdump -C /etc/ld.so.cache | tail
000130c0  6f 2e 30 2e 30 00 2f 6c  69 62 2f 78 38 36 5f 36  |o.0.0./lib/x86_6|
000130d0  34 2d 6c 69 6e 75 78 2d  67 6e 75 2f 6c 69 62 67  |4-linux-gnu/libg|
000130e0  63 69 2d 31 2e 73 6f 2e  30 2e 30 2e 30 00 00 00  |ci-1.so.0.0.0...|
000130f0  74 21 a4 ea 01 00 00 00  00 00 00 00 00 00 00 00  |t!..............|
00013100  08 31 01 00 42 00 00 00  6c 64 63 6f 6e 66 69 67  |.1..B...ldconfig|
00013110  20 28 55 62 75 6e 74 75  20 47 4c 49 42 43 20 32  | (Ubuntu GLIBC 2|
00013120  2e 33 33 2d 30 75 62 75  6e 74 75 35 29 20 72 65  |.33-0ubuntu5) re|
00013130  6c 65 61 73 65 20 72 65  6c 65 61 73 65 20 76 65  |lease release ve|
00013140  72 73 69 6f 6e 20 32 2e  33 33                    |rsion 2.33|
0001314a
{noformat}

Trivial to fix, give me a minute...
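
For reference, the extra bytes in the second dump can be decoded field by field. The sketch below is an interpretation, not something confirmed in this thread: it assumes the 24 bytes at offsets 0x130f0 to 0x13107 form a little-endian header (magic, count) followed by one section descriptor (tag, flags, offset, size), which is how the optional extension block written by newer glibc {{ldconfig}} appears to be laid out. The {{Trailer}} struct and the {{le32}} and {{decodeTrailer}} helpers are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Reads a little-endian 32-bit value starting at `p`.
std::uint32_t le32(const unsigned char* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Assumed layout: a magic/count header followed by one 16-byte section.
struct Trailer {
  std::uint32_t magic, count, tag, flags, offset, size;
};

// The 24 bytes at file offsets 0x130f0..0x13107 of the Ubuntu dump.
constexpr std::array<unsigned char, 24> kTail = {
    0x74, 0x21, 0xa4, 0xea, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x08, 0x31, 0x01, 0x00, 0x42, 0x00, 0x00, 0x00};

// Decodes six consecutive little-endian 32-bit fields.
Trailer decodeTrailer(const unsigned char* p) {
  return Trailer{le32(p),      le32(p + 4),  le32(p + 8),
                 le32(p + 12), le32(p + 16), le32(p + 20)};
}
```

Read this way, the section descriptor points at offset 0x13108 with a size of 0x42 bytes, which is exactly the span of the "ldconfig (Ubuntu GLIBC ..." string up to the end of the file at 0x1314a. That would make the trailing data a structured block rather than random bytes.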



[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367579#comment-17367579
 ] 

Charles Natali commented on MESOS-10224:


No, it should really work; it's a bit strange.
It's possible there's something special about your cache, but it looks valid since I 
can parse it using {{ldconfig -p}}.

Shouldn't be too difficult to fix, hopefully.



[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Saad Ur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367576#comment-17367576
 ] 

Saad Ur Rahman commented on MESOS-10224:


[~cf.natali] thanks! I will follow along on this one. I am not 100% 
comfortable making the changes to get this to pass because I do not want to 
break things for other configurations.



[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367572#comment-17367572
 ] 

Charles Natali commented on MESOS-10224:


Interesting, I can reproduce it - I'll have a look.



[jira] [Comment Edited] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Saad Ur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367567#comment-17367567
 ] 

Saad Ur Rahman edited comment on MESOS-10224 at 6/22/21, 6:07 PM:
--

[~qianzhang] thanks for getting back to me. My upstream is up to date and I am 
building on the main branch - still no joy. I did a clean git clone, built, and 
then ran the tests, with the same error. I am attaching my [^ld.so.cache] for 
inspection, if it helps.





[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Saad Ur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367567#comment-17367567
 ] 

Saad Ur Rahman commented on MESOS-10224:


[~qianzhang] thanks for getting back to me. My upstream is up to date and I am 
building on the main branch - still no joy. I did a clean git clone, built, and 
then ran the tests, with the same error. I am attaching my [^ld.so.cache] for 
inspection, if it helps.

 

 



[jira] [Commented] (MESOS-10223) Test failures on Linux ARM64

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367559#comment-17367559
 ] 

Charles Natali commented on MESOS-10223:


By the way, the reason for running it as root is that many tests are only run 
as root (e.g. tests which need cgroups etc), so it'd be nice to make sure they 
pass.

> Test failures on Linux ARM64
> ----------------------------
>
> Key: MESOS-10223
> URL: https://issues.apache.org/jira/browse/MESOS-10223
> Project: Mesos
> Issue Type: Bug
> Reporter: Martin Tzvetanov Grigorov
> Priority: Major
> Attachments: 0001-Fixed-crashes-on-ARM64-due-to-libunwind.patch, 
> mesos-on-arm64.tgz
>
>
> Running `make check` on Ubuntu 20.04.2 aarch64 fails with such errors:
>  
> {code:java}
> [----------] 3 tests from JsonTest
> [ RUN      ] JsonTest.NumberFormat
> [       OK ] JsonTest.NumberFormat (0 ms)
> [ RUN      ] JsonTest.Find
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_lexical_cast> >'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090d) received by PID 2317 (TID 0xa80d9010) from 
> PID 2317; stack trace: ***
> @ 0xa80e77fc ([vdso]+0x7fb)
> @ 0xa7b71188 gsignal
> @ 0xa7b5ddac abort
> @ 0xa7d73848 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d715b0 __cxa_rethrow
> @ 0xa7d737e4 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d71544 __cxa_throw
> @ 0xab4ee114 boost::throw_exception<>()
> @ 0xab5c512c boost::conversion::detail::throw_bad_cast<>()
> @ 0xab5c2228 boost::lexical_cast<>()
> @ 0xab5bf89c numify<>()
> @ 0xab5e00e8 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5cdd2c JsonTest_Find_Test::TestBody()
> @ 0xab886fec 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87f1d4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab85a9d0 testing::Test::Run()
> @ 0xab85b258 testing::TestInfo::Run()
> @ 0xab85b8d0 testing::TestCase::Run()
> @ 0xab862344 testing::internal::UnitTestImpl::RunAllTests()
> @ 0xab888440 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87ffd4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab86100c testing::UnitTest::Run()
> @ 0xab630950 RUN_ALL_TESTS()
> @ 0xab630418 main
> @ 0xa7b5e110 __libc_start_main
> @ 0xab4b41d4 (unknown)
> [FAIL]: 8 shard(s) have failed tests
> make[6]: *** [Makefile:2092: check-local] Error 8
> make[6]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[5]: *** [Makefile:1840: check-am] Error 2
> make[5]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[4]: *** [Makefile:1685: check-recursive] Error 1
> make[4]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[3]: *** [Makefile:1842: check] Error 2
> make[3]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[2]: *** [Makefile:1153: check-recursive] Error 1
> make[2]: Leaving directory '/home/ubuntu/git/apache/mesos/build/3rdparty'
> make[1]: *** [Makefile:1306: check] Error 2
> make[1]: Leaving directory '/home/ubuntu/git/apache/mesos/build/3rdparty'
> make: *** [Makefile:785: check-recursive] Error 1
> {code}
>  
> {code:java}
> [----------] 3 tests from JsonTest
> [ RUN      ] JsonTest.InvalidUTF8
> [       OK ] JsonTest.InvalidUTF8 (0 ms)
> [ RUN      ] JsonTest.ParseError
> terminate called after throwing an instance of 'std::overflow_error'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090c) received by PID 2316 (TID 0x918cf010) from 
> PID 2316; stack trace: ***
> @ 0x918dd7fc ([vdso]+0x7fb)
> @ 0x91367188 gsignal
> @ 0x91353dac abort
> @ 0x91569848 __gnu_cxx::__verbose_terminate_handler()
> @ 0x915671ec (unknown)
> @ 0x91567250 std::terminate()
> @ 0x915675b0 __cxa_rethrow
> @ 0x915697e4 __gnu_cxx::__verbose_terminate_handler()
> @ 0x915671ec (unknown)
> @ 0x91567250 std::terminate()
>   

[jira] [Commented] (MESOS-10223) Test failures on Linux ARM64

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367554#comment-17367554
 ] 

Charles Natali commented on MESOS-10223:


Hey [~mgrigorov]

The attached patch should fix the issue. I ran the whole test suite and it 
pretty much passed; however, it would be great if you could run it as root with 
the attached patch, just to make sure: 
[^0001-Fixed-crashes-on-ARM64-due-to-libunwind.patch] 

There might be some unrelated or transient errors, but I'm just interested in 
seeing that this problem is fixed.

Thanks!


[jira] [Commented] (MESOS-10224) [test] CSIVersion/StorageLocalResourceProviderTest.OperationUpdate fails.

2021-06-22 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367313#comment-17367313
 ] 

Qian Zhang commented on MESOS-10224:


[~surahman] I think it has been fixed in this PR 
([https://github.com/apache/mesos/pull/384]) by [~cf.natali] recently. Can you 
please get the latest Mesos code and try again?



[jira] [Commented] (MESOS-10223) Test failures on Linux ARM64

2021-06-22 Thread Martin Tzvetanov Grigorov (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367100#comment-17367100
 ] 

Martin Tzvetanov Grigorov commented on MESOS-10223:
---

{quote}if you look at the error it's {{ENOSYS}} not {{EPERM}}.
{quote}
You are right!

 

I've sent you the credentials to the email that you use in your Git commits.

> Test failures on Linux ARM64
> 
>
> Key: MESOS-10223
> URL: https://issues.apache.org/jira/browse/MESOS-10223
> Project: Mesos
>  Issue Type: Bug
>Reporter: Martin Tzvetanov Grigorov
>Priority: Major
> Attachments: mesos-on-arm64.tgz
>
>
> Running `make check` on Ubuntu 20.04.2 aarch64 fails with the following errors:
>  
> {code:java}
>  [--] 3 tests from JsonTest
> [ RUN  ] JsonTest.NumberFormat
> [   OK ] JsonTest.NumberFormat (0 ms)
> [ RUN  ] JsonTest.Find
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090d) received by PID 2317 (TID 0xa80d9010) from 
> PID 2317; stack trace: ***
> @ 0xa80e77fc ([vdso]+0x7fb)
> @ 0xa7b71188 gsignal
> @ 0xa7b5ddac abort
> @ 0xa7d73848 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d715b0 __cxa_rethrow
> @ 0xa7d737e4 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d71544 __cxa_throw
> @ 0xab4ee114 boost::throw_exception<>()
> @ 0xab5c512c boost::conversion::detail::throw_bad_cast<>()
> @ 0xab5c2228 boost::lexical_cast<>()
> @ 0xab5bf89c numify<>()
> @ 0xab5e00e8 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5cdd2c JsonTest_Find_Test::TestBody()
> @ 0xab886fec 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87f1d4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab85a9d0 testing::Test::Run()
> @ 0xab85b258 testing::TestInfo::Run()
> @ 0xab85b8d0 testing::TestCase::Run()
> @ 0xab862344 testing::internal::UnitTestImpl::RunAllTests()
> @ 0xab888440 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87ffd4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab86100c testing::UnitTest::Run()
> @ 0xab630950 RUN_ALL_TESTS()
> @ 0xab630418 main
> @ 0xa7b5e110 __libc_start_main
> @ 0xab4b41d4 (unknown)
> [FAIL]: 8 shard(s) have failed tests
> make[6]: *** [Makefile:2092: check-local] Error 8
> make[6]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[5]: *** [Makefile:1840: check-am] Error 2
> make[5]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[4]: *** [Makefile:1685: check-recursive] Error 1
> make[4]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[3]: *** [Makefile:1842: check] Error 2
> make[3]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[2]: *** [Makefile:1153: check-recursive] Error 1
> make[2]: Leaving directory '/home/ubuntu/git/apache/mesos/build/3rdparty'
> make[1]: *** [Makefile:1306: check] Error 2
> make[1]: Leaving directory '/home/ubuntu/git/apache/mesos/build/3rdparty'
> make: *** [Makefile:785: check-recursive] Error 1
> {code}
>  
> {code:java}
> [--] 3 tests from JsonTest
> [ RUN  ] JsonTest.InvalidUTF8
> [   OK ] JsonTest.InvalidUTF8 (0 ms)
> [ RUN  ] JsonTest.ParseError
> terminate called after throwing an instance of 'std::overflow_error'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090c) received by PID 2316 (TID 0x918cf010) from 
> PID 2316; stack trace: ***
> @ 0x918dd7fc ([vdso]+0x7fb)
> @ 0x91367188 gsignal
> @ 0x91353dac abort
> @ 0x91569848 __gnu_cxx::__verbose_terminate_handler()
> @ 0x915671ec (unknown)
> @ 0x91567250 std::terminate()
> @ 0x915675b0 __cxa_rethrow
> @ 0x915697e4 __gnu_cxx::__verbose_terminate_handler()
> @ 0x915671ec (unknown)
> @ 0x91567250 std::terminate()
> @ 0x91567544 

[jira] [Comment Edited] (MESOS-10223) Test failures on Linux ARM64

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367092#comment-17367092
 ] 

Charles Natali edited comment on MESOS-10223 at 6/22/21, 7:51 AM:
--

{quote}I experienced the errors both on real ARM64 host and with Docker. The 
problem with strace is not related to QEMU. It is a Docker thingy. You need to 
add a capability for it:

 

{{docker run --cap-add=SYS_PTRACE}} ...
{quote}
 

Hm, I don't think so, it's not a capability/seccomp issue: if you look at the 
error it's {{ENOSYS}} not {{EPERM}}.

 

Just to show you:
{noformat}
root@thinkpad:/home/cf/mesos-on-arm64# docker run --cap-add=SYS_PTRACE --rm -it 
-v $PWD/mesos:/mesos build-mesos-on-arm64 bash
WARNING: The requested image's platform (linux/arm64) does not match the 
detected host platform (linux/amd64) and no specific platform was requested
root@4d45b9e91754:/mesos# apt install strace
Reading package lists... Done
Building dependency tree 
Reading state information... Done
The following NEW packages will be installed:
 strace 
0 upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
Need to get 297 kB of archives.
After this operation, 1336 kB of additional disk space will be used.
Get:1 http://ports.ubuntu.com/ubuntu-ports focal-updates/main arm64 strace 
arm64 5.5-3ubuntu1 [297 kB]
Fetched 297 kB in 1s (327 kB/s)
Selecting previously unselected package strace.
(Reading database ... 18530 files and directories currently installed.)
Preparing to unpack .../strace_5.5-3ubuntu1_arm64.deb ...
Unpacking strace (5.5-3ubuntu1) ...
Setting up strace (5.5-3ubuntu1) ...
root@4d45b9e91754:/mesos# strace ls 
/usr/bin/strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Function not 
implemented
/usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented
/usr/bin/strace: PTRACE_SETOPTIONS: Function not implemented
/usr/bin/strace: detach: waitpid(115): No child processes
/usr/bin/strace: Process 115 detached
{noformat}
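
To make the {{ENOSYS}} versus {{EPERM}} distinction concrete, here is a minimal sketch (not taken from Mesos or strace) that attempts the same {{PTRACE_TRACEME}} request in a forked child and reports which errno came back. The function names {{diagnosePtraceErrno}} and {{probePtraceErrno}} are hypothetical, and the causes named in the messages are assumptions about typical setups: a missing capability or a seccomp filter usually surfaces as {{EPERM}}, while {{ENOSYS}} suggests the syscall is not implemented at all, as can happen under user-mode emulation.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: maps the errno of a failed ptrace() call to a
// likely cause. The wording of the causes is an assumption, not a rule.
std::string diagnosePtraceErrno(int err) {
  switch (err) {
    case 0:
      return "ptrace works";
    case EPERM:
      return "permission denied (capability, seccomp or similar policy)";
    case ENOSYS:
      return "syscall not implemented (for example by an emulation layer)";
    default:
      return std::string("other error: ") + std::strerror(err);
  }
}

// Hypothetical helper: forks a child that attempts PTRACE_TRACEME and
// passes the resulting errno back through its exit status. The request is
// made in a child so the calling process itself never becomes a tracee.
// Returns 0 on success, an errno value on failure (including a failed
// fork or waitpid), or -1 if the child did not exit normally.
int probePtraceErrno() {
  pid_t pid = fork();
  if (pid < 0) {
    return errno;
  }
  if (pid == 0) {
    errno = 0;
    long rc = ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
    _exit(rc == -1 ? errno : 0);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) {
    return errno;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling {{diagnosePtraceErrno(probePtraceErrno())}} inside the container should then tell the two cases apart without installing strace. If it reports {{ENOSYS}}, adding {{--cap-add=SYS_PTRACE}} would not be expected to help.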


Did you test this docker+qemu image from a non-AMD64 host?


> I will send you privately credentials to my ARM64 VM where you can debug it 
> without Docker!

Thanks, that'd be much easier


> Test failures on Linux ARM64
> 
>
> Key: MESOS-10223
> URL: https://issues.apache.org/jira/browse/MESOS-10223
> Project: Mesos
>  Issue Type: Bug
>Reporter: Martin Tzvetanov Grigorov
>Priority: Major
> Attachments: mesos-on-arm64.tgz
>
>
> Running `make check` on Ubuntu 20.04.2 aarch64 fails with the following errors:
>  
> {code:java}
>  [--] 3 tests from JsonTest
> [ RUN  ] JsonTest.NumberFormat
> [   OK ] JsonTest.NumberFormat (0 ms)
> [ RUN  ] JsonTest.Find
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d 

[jira] [Commented] (MESOS-10223) Test failures on Linux ARM64

2021-06-22 Thread Charles Natali (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367092#comment-17367092
 ] 

Charles Natali commented on MESOS-10223:


{quote}I experienced the errors both on real ARM64 host and with Docker. The 
problem with strace is not related to QEMU. It is a Docker thingy. You need to 
add a capability for it:

 

{{docker run --cap-add=SYS_PTRACE}} ...
{quote}
 

Hm, I don't think so, it's not a capability/seccomp issue: if you look at the 
error it's {{ENOSYS}} not {{EPERM}}.

 

Just to show you:
{noformat}
root@thinkpad:/home/cf/mesos-on-arm64# docker run --cap-add=SYS_PTRACE --rm -it 
-v $PWD/mesos:/mesos build-mesos-on-arm64 bash
WARNING: The requested image's platform (linux/arm64) does not match the 
detected host platform (linux/amd64) and no specific platform was requested
root@4d45b9e91754:/mesos# apt install strace
Reading package lists... Done
Building dependency tree 
Reading state information... Done
The following NEW packages will be installed:
 strace 
0 upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
Need to get 297 kB of archives.
After this operation, 1336 kB of additional disk space will be used.
Get:1 http://ports.ubuntu.com/ubuntu-ports focal-updates/main arm64 strace 
arm64 5.5-3ubuntu1 [297 kB]
Fetched 297 kB in 1s (327 kB/s)
Selecting previously unselected package strace.
(Reading database ... 18530 files and directories currently installed.)
Preparing to unpack .../strace_5.5-3ubuntu1_arm64.deb ...
Unpacking strace (5.5-3ubuntu1) ...
Setting up strace (5.5-3ubuntu1) ...
root@4d45b9e91754:/mesos# strace ls 
/usr/bin/strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Function not 
implemented
/usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented
/usr/bin/strace: PTRACE_SETOPTIONS: Function not implemented
/usr/bin/strace: detach: waitpid(115): No child processes
/usr/bin/strace: Process 115 detached
{noformat}


Did you test this docker+qemu image from a non-AMD64 host?


> I will send you privately credentials to my ARM64 VM where you can debug it 
> without Docker!

> Test failures on Linux ARM64
> 
>
> Key: MESOS-10223
> URL: https://issues.apache.org/jira/browse/MESOS-10223
> Project: Mesos
>  Issue Type: Bug
>Reporter: Martin Tzvetanov Grigorov
>Priority: Major
> Attachments: mesos-on-arm64.tgz
>
>
> Running `make check` on Ubuntu 20.04.2 aarch64 fails with the following errors:
>  
> {code:java}
>  [--] 3 tests from JsonTest
> [ RUN  ] JsonTest.NumberFormat
> [   OK ] JsonTest.NumberFormat (0 ms)
> [ RUN  ] JsonTest.Find
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090d) received by PID 2317 (TID 0xa80d9010) from 
> PID 2317; stack trace: ***
> @ 0xa80e77fc ([vdso]+0x7fb)
> @ 0xa7b71188 gsignal
> @ 0xa7b5ddac abort
> @ 0xa7d73848 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d715b0 __cxa_rethrow
> @ 0xa7d737e4 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d71544 __cxa_throw
> @ 0xab4ee114 boost::throw_exception<>()
> @ 0xab5c512c boost::conversion::detail::throw_bad_cast<>()
> @ 0xab5c2228 boost::lexical_cast<>()
> @ 0xab5bf89c numify<>()
> @ 0xab5e00e8 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5cdd2c JsonTest_Find_Test::TestBody()
> @ 0xab886fec 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87f1d4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab85a9d0 testing::Test::Run()
> @ 0xab85b258 testing::TestInfo::Run()
> @ 0xab85b8d0 testing::TestCase::Run()
> @ 0xab862344 testing::internal::UnitTestImpl::RunAllTests()
> @ 0xab888440 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87ffd4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab86100c testing::UnitTest::Run()
> @ 0xab630950 RUN_ALL_TESTS()
> @ 0xab630418 main
> @ 0xa7b5e110 __libc_start_main
> @ 0xab4b41d4 (unknown)
> [FAIL]: 8 shard(s) have failed tests
> make[6]: *** [Makefile:2092: check-local] Error 8
> make[6]: Leaving directory 
> 

[jira] [Commented] (MESOS-10223) Test failures on Linux ARM64

2021-06-22 Thread Martin Tzvetanov Grigorov (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17367061#comment-17367061
 ] 

Martin Tzvetanov Grigorov commented on MESOS-10223:
---

Hi [~cf.natali] !

I experienced the errors both on real ARM64 host and with Docker. The problem 
with strace is not related to QEMU. It is a Docker thingy. You need to add a 
capability for it:

 

{{docker run --cap-add=SYS_PTRACE}} ...

 

I will privately send you credentials to my ARM64 VM, where you can debug it 
without Docker!

> Test failures on Linux ARM64
> 
>
> Key: MESOS-10223
> URL: https://issues.apache.org/jira/browse/MESOS-10223
> Project: Mesos
>  Issue Type: Bug
>Reporter: Martin Tzvetanov Grigorov
>Priority: Major
> Attachments: mesos-on-arm64.tgz
>
>
> Running `make check` on Ubuntu 20.04.2 aarch64 fails with the following errors:
>  
> {code:java}
>  [--] 3 tests from JsonTest
> [ RUN  ] JsonTest.NumberFormat
> [   OK ] JsonTest.NumberFormat (0 ms)
> [ RUN  ] JsonTest.Find
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090d) received by PID 2317 (TID 0xa80d9010) from 
> PID 2317; stack trace: ***
> @ 0xa80e77fc ([vdso]+0x7fb)
> @ 0xa7b71188 gsignal
> @ 0xa7b5ddac abort
> @ 0xa7d73848 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d715b0 __cxa_rethrow
> @ 0xa7d737e4 __gnu_cxx::__verbose_terminate_handler()
> @ 0xa7d711ec (unknown)
> @ 0xa7d71250 std::terminate()
> @ 0xa7d71544 __cxa_throw
> @ 0xab4ee114 boost::throw_exception<>()
> @ 0xab5c512c boost::conversion::detail::throw_bad_cast<>()
> @ 0xab5c2228 boost::lexical_cast<>()
> @ 0xab5bf89c numify<>()
> @ 0xab5e00e8 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5e0584 JSON::Object::find<>()
> @ 0xab5cdd2c JsonTest_Find_Test::TestBody()
> @ 0xab886fec 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87f1d4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab85a9d0 testing::Test::Run()
> @ 0xab85b258 testing::TestInfo::Run()
> @ 0xab85b8d0 testing::TestCase::Run()
> @ 0xab862344 testing::internal::UnitTestImpl::RunAllTests()
> @ 0xab888440 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0xab87ffd4 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0xab86100c testing::UnitTest::Run()
> @ 0xab630950 RUN_ALL_TESTS()
> @ 0xab630418 main
> @ 0xa7b5e110 __libc_start_main
> @ 0xab4b41d4 (unknown)
> [FAIL]: 8 shard(s) have failed tests
> make[6]: *** [Makefile:2092: check-local] Error 8
> make[6]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[5]: *** [Makefile:1840: check-am] Error 2
> make[5]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[4]: *** [Makefile:1685: check-recursive] Error 1
> make[4]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[3]: *** [Makefile:1842: check] Error 2
> make[3]: Leaving directory 
> '/home/ubuntu/git/apache/mesos/build/3rdparty/stout'
> make[2]: *** [Makefile:1153: check-recursive] Error 1
> make[2]: Leaving directory '/home/ubuntu/git/apache/mesos/build/3rdparty'
> make[1]: *** [Makefile:1306: check] Error 2
> make[1]: Leaving directory '/home/ubuntu/git/apache/mesos/build/3rdparty'
> make: *** [Makefile:785: check-recursive] Error 1
> {code}
>  
> {code:java}
> [--] 3 tests from JsonTest
> [ RUN  ] JsonTest.InvalidUTF8
> [   OK ] JsonTest.InvalidUTF8 (0 ms)
> [ RUN  ] JsonTest.ParseError
> terminate called after throwing an instance of 'std::overflow_error'
> terminate called recursively
> *** Aborted at 1622796321 (unix time) try "date -d @1622796321" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGABRT (@0x3e8090c) received by PID 2316 (TID 0x918cf010) from 
> PID 2316; stack trace: ***
> @ 0x918dd7fc ([vdso]+0x7fb)
> @ 0x91367188 gsignal
> @ 0x91353dac abort
> @ 0x91569848 __gnu_cxx::__verbose_terminate_handler()
> @ 0x915671ec (unknown)
> @ 0x91567250 std::terminate()
> @ 0x915675b0 __cxa_rethrow
> @