Re: Apache Beam with Hive

2020-02-10 Thread Jean-Baptiste Onofre
Hi,

If you are OK with accessing Hive via JDBC, you can use JdbcIO; see the
example in the Javadoc:

https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L82
 


Regards
JB

> On Feb 11, 2020, at 07:03, Gershi, Noam wrote:
> 
> Hi,
>  
> I am looking for a detailed example of how to use Apache Beam with Hive
> and/or HCatalog: creating tables, inserting data, and fetching it.
>  
>  
>   Noam Gershi Software Developer
>   ICG Technology – TLV Lab
>   T: +972 (3) 7405718
>  
>
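
For readers on the Python SDK, the JDBC route suggested above can be sketched
as follows. This is a swapped-in illustration, not the Java JdbcIO example
from the linked Javadoc: it assumes a newer Beam release that ships the
cross-language `ReadFromJdbc` transform, a HiveServer2 instance on the
default port 10000, and that the Hive JDBC driver is made available to the
Java expansion service (not arranged here). The helper names `hive_jdbc_url`
and `read_hive_table` are hypothetical.

```python
"""Sketch: reading a Hive table over JDBC from a Beam Python pipeline."""

# JDBC driver class used by HiveServer2 clients.
HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver"


def hive_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL, e.g. jdbc:hive2://host:10000/default."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def read_hive_table(pipeline, table, host, username="", password=""):
    """Attach a JDBC read of `table` to `pipeline`.

    Assumption: apache_beam is installed and a Java runtime is available,
    because ReadFromJdbc is a cross-language transform.
    """
    # Imported lazily so the URL helper works without Beam installed.
    from apache_beam.io.jdbc import ReadFromJdbc

    return pipeline | "ReadFromHive" >> ReadFromJdbc(
        table_name=table,
        driver_class_name=HIVE_DRIVER,
        jdbc_url=hive_jdbc_url(host),
        username=username,
        password=password,
    )
```

Creating tables is outside what this sketch covers; it only shows the read
side, and the write side would need the same driver and URL.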



Apache Beam with Hive

2020-02-10 Thread Gershi, Noam
Hi,

I am looking for a detailed example of how to use Apache Beam with Hive
and/or HCatalog: creating tables, inserting data, and fetching it.


  Noam Gershi Software Developer
  ICG Technology - TLV Lab
  T: +972 (3) 7405718




Re: Running Beam on Flink

2020-02-10 Thread jincheng sun
Hi Xander,

When the environment type is configured as "LOOPBACK", an external Python
worker pool service is started on the client side when the job is
submitted. The job's operators (which run in Docker) then try to connect
to that external Python worker pool service to start the Python SDK
harness. The latest failure occurs because the operators running inside
the Docker container cannot reach a service outside the container, so
they cannot connect to the external Python worker pool service. I'm sorry,
but I don't know the best practice for making this work when the job
service runs in a Docker container on a Mac. Until a best practice is
known, I guess you could start the job service directly on the bare metal
(that is, on the host, outside Docker).

Best,
Jincheng
-
Twitter: https://twitter.com/sunjincheng121
-
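
The checks discussed in this thread can be sketched in Python. The first
helper is a rough stand-in for `telnet localhost 8099`; the second only
builds the option strings for submitting to a job server with the LOOPBACK
environment. The helper names `port_is_open` and `portable_runner_args` are
hypothetical, and the default endpoint `localhost:8099` is taken from the
thread rather than from any configuration.

```python
"""Sketch: probe the job server port, then build portable-runner options."""
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


def portable_runner_args(job_endpoint="localhost:8099",
                         environment_type="LOOPBACK"):
    """Return command-line style options for the Beam Python SDK."""
    return [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        f"--environment_type={environment_type}",
    ]
```

The returned list could be passed to `PipelineOptions(...)` when building
the pipeline. Note that an open port 8099 only shows that submission can
reach the job server; it says nothing about whether operators inside the
container can connect back to the worker pool on the client, which is the
failure described above.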


Xander Song wrote on Mon, Feb 10, 2020 at 2:29 AM:

> Hi Jincheng,
>
> Thanks for your help. Yes, I am using Mac. Your suggestion allowed me to
> submit the job on port 8099. However, I am now encountering a different
> error message.
>
> ERROR:root:java.net.ConnectException: Connection refused
>
> Traceback (most recent call last):
>   File "test_beam_local_flink.py", line 18, in 
>     | apache_beam.Map(lambda x: print(x))
>   File "/Users/xander/Projects/flink-test/env/lib/python3.7/site-packages/apache_beam/pipeline.py", line 481, in __exit__
>     self.run().wait_until_finish()
>   File "/Users/xander/Projects/flink-test/env/lib/python3.7/site-packages/apache_beam/pipeline.py", line 461, in run
>     self._options).run(False)
>   File "/Users/xander/Projects/flink-test/env/lib/python3.7/site-packages/apache_beam/pipeline.py", line 474, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/Users/xander/Projects/flink-test/env/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 334, in run_pipeline
>     result.wait_until_finish()
>   File "/Users/xander/Projects/flink-test/env/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 455, in wait_until_finish
>     self._job_id, self._state, self._last_error_message()))
>
> RuntimeError: Pipeline
> BeamApp-root-0209182021-fc6ec677_c351e1bb-d428-4475-9910-4544d8ec1de4
> failed in state FAILED: java.net.ConnectException: Connection refused
>
> I've attached a text file containing the output from the terminal running
> the Docker container in case that is informative (it is quite lengthy). Any
> suggestions are appreciated.
>
> Best,
> Xander
>
> On Sun, Feb 9, 2020 at 2:44 AM jincheng sun 
> wrote:
>
>>
>> Hi Xander,
>>
>> Are you using a Mac? The option --net=host doesn't work on Mac [1].
>> Could you check whether the command
>> `docker run -p 8099:8099 -p 8098:8098 -p 8097:8097
>> apachebeam/flink1.9_job_server:latest` works?
>>
>> [1] https://forums.docker.com/t/should-docker-run-net-host-work/14215/28
>>
>> Best,
>> Jincheng
>> -
>> Twitter: https://twitter.com/sunjincheng121
>> -
>>
>>
>> Xander Song wrote on Sun, Feb 9, 2020 at 11:52 AM:
>>
>>> Do you have any suggestions for addressing this issue? I am unsure of
>>> what to try next.
>>>
>>> On Fri, Feb 7, 2020 at 5:55 PM Ankur Goenka  wrote:
>>>
 That seems to be a problem.

 When I try the command, I get

 $ telnet localhost 8099
 Trying ::1...
 Connected to localhost.
 Escape character is '^]'.
 ^CConnection closed by foreign host.

 On Fri, Feb 7, 2020 at 5:34 PM Xander Song 
 wrote:

> Thanks for the response. After entering telnet localhost 8099, I receive
>
> Trying ::1...
> telnet: connect to address ::1: Connection refused
> Trying 127.0.0.1...
> telnet: connect to address 127.0.0.1: Connection refused
> telnet: Unable to connect to remote host
>
>
> On Fri, Feb 7, 2020 at 11:41 AM Ankur Goenka 
> wrote:
>
>> It seems that pipeline submission from the SDK is not able to reach the
>> job server that was started in Docker.
>>
>> Can you try running "telnet localhost 8099" to check whether pipeline
>> submission can reach the job server?
>>
>> On Thu, Feb 6, 2020 at 8:16 PM Xander Song 
>> wrote:
>>
>>> I am having difficulty following the Python guide for running Beam
>>> on Flink. I created a virtual environment with Apache Beam installed,
>>> then I started up the JobService Docker container with
>>>
>>> docker run --net=host apachebeam/flink1.9_job_server:latest
>>>
>>>
>>> I receive the following message confirming that the container is
>>> running.
>>>
>>>
>>> [main] INFO
>>> org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver -
>>> ArtifactStagingService started on localhost:8098
>>>
>>> [main] INFO
>>