[ 
https://issues.apache.org/jira/browse/IMPALA-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843791#comment-17843791
 ] 

ASF subversion and git services commented on IMPALA-13015:
----------------------------------------------------------

Commit 5045f19b5374678c10888376955f2ff5e360ae5b in impala's branch 
refs/heads/branch-4.4.0 from Abhishek Rawat
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5045f19b5 ]

IMPALA-13015: Dataload fails due to concurrency issue with test.jceks

Move the 'hadoop credential' command used for creating test.jceks to
testdata/bin/create-load-data.sh. Previously it was in bin/load-data.py,
which is called in parallel, and this caused failures due to race
conditions.

Testing:
- Ran JniFrontendTest#testGetSecretFromKeyStore after data loading and
the test ran clean.

Change-Id: I7fbeffc19f2b78c19fee9acf7f96466c8f4f9bcd
Reviewed-on: http://gerrit.cloudera.org:8080/21346
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
(cherry picked from commit f620e5d5c0bbdb0fd97bac31c7b7439cd13c6d08)
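
The idea behind the change, running one-time setup serially before any parallel loaders start, can be sketched roughly as follows. This is only an illustration under stated assumptions: create_keystore, load_workload and run_dataload are hypothetical names, the keystore is a placeholder file rather than the output of the real 'hadoop credential' command, and threads stand in for the separate processes that testdata/bin/create-load-data.sh launches.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def create_keystore(path):
    # Hypothetical stand-in for the real 'hadoop credential' invocation,
    # which is not reproduced here. Safe only because it runs exactly once.
    if os.path.exists(path):
        os.remove(path)
    with open(path, "w") as f:
        f.write("placeholder keystore\n")


def load_workload(name, keystore_path):
    # Hypothetical stand-in for one bin/load-data.py run. It only reads the
    # shared file and never deletes or re-creates it.
    with open(keystore_path) as f:
        f.read()
    return name


def run_dataload(keystore_path, workloads):
    # One-time setup happens serially, before any parallel work begins.
    create_keystore(keystore_path)
    with ThreadPoolExecutor(max_workers=len(workloads)) as pool:
        return list(pool.map(lambda w: load_workload(w, keystore_path),
                             workloads))
```

Because the delete-and-create step has a single caller in this arrangement, the parallel loaders have nothing left to race on.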


> Dataload fails due to concurrency issue with test.jceks
> -------------------------------------------------------
>
>                 Key: IMPALA-13015
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13015
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 4.4.0
>            Reporter: Joe McDonnell
>            Assignee: Abhishek Rawat
>            Priority: Major
>              Labels: flaky
>
> When doing dataload locally, it fails with this error:
> {noformat}
> Traceback (most recent call last):
>   File "/home/joemcdonnell/upstream/Impala/bin/load-data.py", line 523, in <module>
>     if __name__ == "__main__": main()
>   File "/home/joemcdonnell/upstream/Impala/bin/load-data.py", line 322, in main
>     os.remove(jceks_path)
> OSError: [Errno 2] No such file or directory: '/home/joemcdonnell/upstream/Impala/testdata/jceks/test.jceks'
> Background task Loading functional-query data (pid 501094) failed.
> {noformat}
> testdata/bin/create-load-data.sh calls bin/load-data.py for functional, 
> TPC-H, and TPC-DS in parallel, so this logic has race conditions:
> {noformat}
>   jceks_path = TESTDATA_JCEKS_DIR + "/test.jceks"
>   if os.path.exists(jceks_path):
>     os.remove(jceks_path)
> {noformat}
> I don't see a specific reason for this to be in bin/load-data.py. It should
> be moved somewhere that doesn't run in parallel. One option is to add a
> step to testdata/bin/create-load-data.sh.
> This was introduced in 
> [https://github.com/apache/impala/commit/9837637d9342a49288a13a421d4e749818da1432]
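
The exists-then-remove snippet quoted in the description is a time-of-check/time-of-use race: two processes can both see the file, and whichever calls os.remove() second gets the OSError shown in the traceback. As a minimal sketch only (this is not the committed fix, which moved the step out of the parallel path), the removal itself could be made tolerant of a concurrent delete. The helper name remove_if_present is hypothetical and is not part of bin/load-data.py.

```python
import errno
import os


def remove_if_present(path):
    """Remove path, treating "already gone" as success.

    Hypothetical helper for illustration only.
    Returns True if this call removed the file, False if it was absent.
    """
    try:
        os.remove(path)
        return True
    except OSError as e:
        # ENOENT means the file did not exist, for example because another
        # process removed it first. Any other errno is a real error.
        if e.errno != errno.ENOENT:
            raise
        return False
```

This would only avoid the crash. Parallel callers could still race while re-creating test.jceks, so doing the work once in a serial step is the more robust approach.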



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
