[ https://issues.apache.org/jira/browse/BEAM-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anonymous updated BEAM-9509:
----------------------------
Status: Triage Needed (was: Resolved)
> Subprocess job server treats missing local file as remote URL
> -------------------------------------------------------------
>
> Key: BEAM-9509
> URL: https://issues.apache.org/jira/browse/BEAM-9509
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Reporter: Kyle Weaver
> Assignee: Kyle Weaver
> Priority: P2
> Labels: portability-spark
> Fix For: 2.21.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
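> When the job server jar requested (e.g. by portableWordCountSparkRunnerBatch)
> is missing (for example, because it has not been built yet), the error
> message is misleading: the local path is handed to urlopen as if it were a
> remote URL, and the failure surfaces as "ValueError: unknown url type".
> The expected behavior is that the path is recognized as a local file and a
> message is printed instructing the user to build the jar.
>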
> INFO:apache_beam.utils.subprocess_server:Downloading job server jar from /usr/local/google/home/kcweaver/go/src/github.com/apache/beam/runners/spark/job-server/build/libs/beam-runners-spark-job-server-2.21.0-SNAPSHOT.jar
> Traceback (most recent call last):
>   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 142, in <module>
>     run()
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 121, in run
>     result = p.run()
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/pipeline.py", line 495, in run
>     self._options).run(False)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/pipeline.py", line 508, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/spark_runner.py", line 45, in run_pipeline
>     return super(SparkRunner, self).run_pipeline(pipeline, options)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 386, in run_pipeline
>     job_service_handle = self.create_job_service(options)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 293, in create_job_service
>     return JobServiceHandle(server.start(), options)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 86, in start
>     self._endpoint = self._job_server.start()
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 111, in start
>     cmd, endpoint = self.subprocess_cmd_and_endpoint()
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 156, in subprocess_cmd_and_endpoint
>     jar_path = self.local_jar(self.path_to_jar())
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 153, in local_jar
>     return subprocess_server.JavaJarServer.local_jar(url)
>   File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/utils/subprocess_server.py", line 206, in local_jar
>     url_read = urlopen(url)
>   File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
>     return opener.open(url, data, timeout)
>   File "/usr/lib/python3.7/urllib/request.py", line 510, in open
>     req = Request(fullurl, data)
>   File "/usr/lib/python3.7/urllib/request.py", line 328, in __init__
>     self.full_url = url
>   File "/usr/lib/python3.7/urllib/request.py", line 354, in full_url
>     self._parse()
>   File "/usr/lib/python3.7/urllib/request.py", line 383, in _parse
>     raise ValueError("unknown url type: %r" % self.full_url)
> ValueError: unknown url type: '/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/runners/spark/job-server/build/libs/beam-runners-spark-job-server-2.21.0-SNAPSHOT.jar'
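
To illustrate the expected behavior described in the issue, here is a minimal sketch of one way to tell a local jar path apart from a remote URL before attempting a download. This is not the fix that was applied in Beam; the function name classify_jar_location and the wording of the error message are hypothetical, and the sketch assumes that only http and https URLs count as remote.

```python
import os
from urllib.parse import urlparse


def classify_jar_location(path_or_url):
    """Hypothetical helper: decide how a job server jar reference should be handled.

    Returns "local" if the string names an existing file, "remote" if it is an
    http(s) URL, and raises FileNotFoundError otherwise, so that a missing
    local build output is never passed to urlopen.
    """
    # An existing file is always treated as local, whatever it looks like.
    if os.path.exists(path_or_url):
        return "local"

    # Only strings with an explicit http/https scheme are treated as remote
    # (an assumption made for this sketch).
    scheme = urlparse(path_or_url).scheme
    if scheme in ("http", "https"):
        return "remote"

    # Anything else is assumed to be a local path that does not exist yet,
    # for example a jar that has not been built.
    raise FileNotFoundError(
        "Job server jar not found at %r. If this is a local build path, "
        "build the jar first." % path_or_url)
```

With a check of this kind, the missing SNAPSHOT jar in the report above would produce a "not found, build the jar first" message instead of "unknown url type".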
--
This message was sent by Atlassian Jira (v8.20.10#820010)