[ https://issues.apache.org/jira/browse/BAHIR-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15390147#comment-15390147 ]
ASF GitHub Bot commented on BAHIR-35:
-------------------------------------

Github user deroneriksson commented on the issue:

    https://github.com/apache/bahir/pull/11

    LGTM @ckadner

    Building spark-streaming-mqtt with current master results in no *.py files in the artifact. After this PR, building spark-streaming-mqtt results in an artifact with *.py files at the root level of the artifact (__init__.py, dstream.py, and mqtt.py).


> Include Python code in the binary jars for use with "--packages ..."
> --------------------------------------------------------------------
>
>                 Key: BAHIR-35
>                 URL: https://issues.apache.org/jira/browse/BAHIR-35
>             Project: Bahir
>          Issue Type: Task
>          Components: Build
>    Affects Versions: 2.0.0
>            Reporter: Christian Kadner
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, to make use of the PySpark code (i.e. streaming-mqtt/python), a user
> has to download the jar from Maven Central or clone the code from GitHub,
> find the individual *.py files, create a zip, and add that to the
> {{spark-submit}} command with the {{--py-files}} option, or add the files to
> the {{PYTHONPATH}} when running locally.
> If we include the Python code in the binary build (in the jar that gets
> uploaded to Maven Central), then users need not do any acrobatics besides
> using the {{--packages ...}} option.
> An example where the Python code is part of the binary jar is the
> [GraphFrames|https://spark-packages.org/package/graphframes/graphframes]
> package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
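
The check described in the review comment above (whether the built artifact carries *.py files at its root) can be reproduced with a few lines of Python. This is only a minimal sketch, assuming the jar is an ordinary zip archive; the function name `root_level_python_files` and the jar path in the usage comment are hypothetical and not part of the Bahir build.

```python
import zipfile


def root_level_python_files(jar_path):
    """Return the sorted names of *.py entries stored at the root of a jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            # Root-level entries have no directory separator in their name.
            if name.endswith(".py") and "/" not in name
        )


# Usage (hypothetical path to a locally built artifact):
# print(root_level_python_files("target/spark-streaming-mqtt.jar"))
```

An empty list would correspond to the "current master" behaviour described in the comment; a list containing the module files would correspond to the behaviour after the PR.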