Re: Flink 1.10 Out of memory

2020-04-24 Thread Som Lima
@Zahir

What Apache means is: don't be like Jesse Anderson, who recommended
Flink on the basis that Apache only uses maps, as seen in the video.

Flink in fact uses ValueState and state in the Streaming API.

It appears Jesse Anderson only looked as deep as the DataStream
hello-world. You are required to think and look deeper.

https://youtu.be/sYlbD_OoHhs

Watch "Airbus makes more of the sky with Flink - Jesse Anderson & Hassene
Ben Salem" on YouTube


https://ci.apache.org/projects/flink/flink-docs-release-1.10/getting-started/walkthroughs/datastream_api.html



On Fri, 24 Apr 2020, 17:38 Zahid Rahman,  wrote:

> > " a simple Stack Overflow search yields.
> "
> Actually it wasn't stack overflow but  a video I saw presented by Robert
> Metzger. of Apache Flink org.
>
> Your mind  must have been fogged up
> with another thought of another email not the contents of my email clearly.
>
> He explained the very many solutions to the out-of-memory error.
>
> Obviously I can't dig any deeper unless I have the code in front of me,
> loaded into an IDE.
>
> For example, I came across Flink archetypes from Alibaba etc. in Eclipse.
>
> I got every conceivable error, errors I had never seen before.
>
> I used Google and Stack Overflow to solve each one.
>
> It took me about 6 hours, but I finally have those archetypes working now.
>
> I also noticed flaky behaviour from IntelliJ when using the Flink examples
> provided on GitHub.
>
> So I loaded the same Flink examples into Eclipse and NetBeans and saw that
> the same flaky behaviour was not present.
>
> I concluded that the flaky behaviour was due to IntelliJ, so I am continuing
> to spend time on Flink and haven't deleted it yet.
>
> I can replicate the IntelliJ flaky behaviour, for the right price.
>
>
> That is software development as I understand it.
>
> Obviously you have different views: that you can debug using a mailing list.
>
> Unlike you, I do not have that skill of debugging software by email, so I
> will not "chime" any more. Nor can I read another person's mind to know
> their skill level or framework familiarity.
>
> You can have all the glory of chiming.
>
> But do keep in mind it was a YouTube video and not Stack Overflow, which is
> mostly a text-based website that other, self-reliant people use
> to address buggy software.
>
> I am quite happy to use the crying pains of those who came before me on
> Stack Overflow to resolve the same software bugs.
>
> It is my view that Stack Overflow is in effect a partner programme of the
> Apache frameworks. How did we develop software before Google or
> Stack Overflow or mailing lists?
>
> I would have to say it was with comprehensive product documentation and by
> making use of software development tools, mainly an understanding that
> software development is a tight binding of traceable logic flow.
>
> Absolutely no magic, except perhaps in the case of an intermittent error.
>
> That was a long-winded personal attack, so this is a long-winded
> explanation.
>
>
> I too am a member of a soon-to-be-extinct tribe; can I be Apache too?
>
> Happy Hunting of Apaches :).
>
>
> On Fri, 24 Apr 2020, 13:54 Stephan Ewen,  wrote:
>
>> @zahid I would kindly ask you to rethink your approach to posting here.
>> Wanting to help answer questions is appreciated, but what you post is
>> always completely disconnected from the actual issue.
>>
>> The questions here usually go beyond the obvious and beyond what a simple
>> Stack Overflow search yields. That's why they are posted here and why the
>> person asking is not simply searching Stack Overflow themselves.
>>
>> So please make the effort to really dig into the problem and try
>> to understand what the specific issue is, rather than posting unrelated
>> Stack Overflow links. If you don't want to do that, please stop chiming in.
>>
>>
>> On Fri, Apr 24, 2020 at 1:15 PM Zahid Rahman 
>> wrote:
>>
>>> https://youtu.be/UEkjRN8jRx4  22:10
>>>
>>>
>>> - One option is to reduce Flink managed memory from the default 70% to
>>> maybe 50%.
>>>
>>> - The error could also be caused by memory going missing elsewhere:
>>>
>>> - the programmer maintaining a local list, and so over-using the
>>> user-allocated memory through heavy processing;
>>>
>>> - or using too small a JVM heap;
>>>
>>> - or the JVM spending too much time on GC.
>>>
>>> In those cases the out-of-memory error has nothing to do with Flink;
>>> Flink is not at fault.
>>>
>>> This tuning process is known as "pimping" Flink.
>>>
>>> Also part of that pimping is to use the local disk for memory spill;
>>> see the sketch below.
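
For illustration, a minimal flink-conf.yaml sketch of those two knobs. The key
names below are the Flink 1.10 ones and the values are examples only; in
earlier versions the managed-memory fraction key was
taskmanager.memory.fraction (default 0.7):

    # conf/flink-conf.yaml
    # Reduce the fraction of memory reserved as Flink managed memory.
    taskmanager.memory.managed.fraction: 0.5
    # Directories Flink may use to spill data to local disk.
    io.tmp.dirs: /mnt/fast-disk/flink-tmp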
>>>
>>> On Fri, 24 Apr 2020, 03:53 Xintong Song,  wrote:
>>>
 @Stephan,
 I don't think so. If the JVM hits the direct memory limit, you should see
 the error message "OutOfMemoryError: Direct buffer memory".

 Thank you~

 Xintong Song



 On Thu, Apr 23, 2020 at 6:11 PM Stephan Ewen  wrote:

> @Xintong and @Lasse could it be that the JVM hits the "Direct Memory"
> limit here?
> Would increasing the 
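
For context, the "Direct Memory" limit discussed here is the JVM's ceiling for
off-heap direct buffers. A hedged sketch of the two usual ways such a limit is
raised (example values only):

    # As a raw JVM flag, e.g. via env.java.opts in conf/flink-conf.yaml:
    #   -XX:MaxDirectMemorySize=2g
    # Or through Flink 1.10's memory model:
    taskmanager.memory.task.off-heap.size: 512mb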

Re: Running 2 instances of LocalExecutionEnvironments with same consumer group.id

2020-04-22 Thread Som Lima
I followed the link; maybe the same Suraj is advertising the Databricks
webinar going on live right now.

On Wed, 22 Apr 2020, 18:38 Gary Yao,  wrote:

> Hi Suraj,
>
> This question has been asked before:
>
>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-consumers-don-t-honor-group-id-td21054.html
>
> Best,
> Gary
>
> On Wed, Apr 22, 2020 at 6:08 PM Suraj Puvvada  wrote:
> >
> > Hello,
> >
> > I have two JVMs that run LocalExecutionEnvironments, each using the same
> consumer group.id.
> >
> > I noticed that the consumers in each instance have all partitions
> assigned. I was expecting that the partitions would be split across the
> consumers in the two JVMs.
> >
> > Any help on what might be happening?
> >
> > Thanks
> > Suraj
>
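
For context, and consistent with the thread linked above: Flink's Kafka
connector assigns topic partitions to its source subtasks statically instead
of going through Kafka's consumer-group coordinator, so two separate jobs
sharing a group.id will each read every partition; the group.id only affects
offset committing. A minimal sketch of such a consumer (broker address and
topic name are placeholders):

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
    props.setProperty("group.id", "my-group"); // used for offset commits, not assignment

    DataStream<String> stream = env.addSource(
        new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props));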


Re: How to use OpenTSDB as Source?

2020-04-22 Thread Som Lima
Thanks for the  link.


On Wed, 22 Apr 2020, 12:19 Jark Wu,  wrote:

> Hi Som,
>
> You can have a look at this documentation:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/common.html#create-a-tableenvironment
> It describes how to create different TableEnvironments based
> on EnvironmentSettings. EnvironmentSettings is a settings object used to set
> up a table environment.
> ExecutionEnvironment is the entry point of the DataSet API,
> and StreamExecutionEnvironment is the entry point of the DataStream API.
> So they have nothing to do with EnvironmentSettings.
>
> Hi Lucas,
>
> I'm sorry that the documentation misses the piece on how to develop
> connectors for SQL DDL.
> The docs will be refined once the new connector API is ready, before the 1.11
> release.
>
> If you want to develop an OpenTSDB source for SQL DDL, you should also
> develop a factory that implements TableSourceFactory,
> and add the full class path to the
> `META-INF/services/org.apache.flink.table.factories.TableFactory` file so
> that it can be discovered by the framework.
> You can take `KafkaTableSourceSinkFactory` [1] as an example.
>
> Please let me know if you have other problems.
>
> Best,
> Jark
>
> [1]:
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/KafkaTableSourceSinkFactory.java
>
>
> On Wed, 22 Apr 2020 at 17:51, Som Lima  wrote:
>
>> For the sake of brevity, the code example does not show the complete code
>> for setting up the environment using the EnvironmentSettings class:
>>
>> EnvironmentSettings settings = EnvironmentSettings.newInstance()...
>>
>>
>> As you can see, by comparison the same protocol is not followed when
>> showing how to set up the environment.
>>
>>
>> StreamExecutionEnvironment env = 
>> StreamExecutionEnvironment.getExecutionEnvironment();
>>
>> or
>>
>> ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>> BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);
>>
>> or
>>
>> ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>>
>> Is there complete code somewhere?
>> Please give me a link.
>>
>>
>> [2]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/create.html
>>
>>
>> On Wed, 22 Apr 2020, 09:36 Marta Paes Moreira, 
>> wrote:
>>
>>> Hi, Lucas.
>>>
>>> There was a lot of refactoring in the Table API / SQL in the last
>>> release, so the user experience is not ideal at the moment — sorry for
>>> that.
>>>
>>> You can try using the DDL syntax to create your table, as shown in
>>> [1,2]. I'm CC'ing Timo and Jark, who should be able to help you further.
>>>
>>> Marta
>>>
>>> [1] https://flink.apache.org/news/2020/02/20/ddl.html
>>> [2]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/create.html
>>>
>>> On Tue, Apr 21, 2020 at 7:02 PM Lucas Kinne <
>>> lucas.ki...@stud-mail.uni-wuerzburg.de> wrote:
>>>
>>>> Hey guys,
>>>>
>>>> in a university project we are storing our collected sensor data in an
>>>> OpenTSDB <http://opentsdb.net/> database.
>>>> I am now trying to use this database as a source in Apache Flink, but I
>>>> can't seem to figure out how to do it.
>>>>
>>>> I have seen that there is no existing connector for this database, but
>>>> I read in the docs
>>>> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sourceSinks.html>
>>>> that it is possible to implement a custom (Batch/Streaming)TableSource.
>>>> There is a Java client for OpenTSDB
>>>> <http://opentsdb.net/docs/javadoc/net/opentsdb/core/TSDB.html>, which
>>>> could be used for that.
>>>>
>>>> So I created a new Java Class "OpenTSDBTableSource" that implements
>>>> "StreamTableSource", "DefinedProctimeAttribute", "DefinedRowtimeAttribute"
>>>> and "LookupableTableSource", as suggested in the docs.
>>>> However, I have no idea how to register this TableSource. The
>>>> "StreamExecutionEnvironment.addSource" method requires a "SourceFunction"
>>>> parameter instead of my "TableSource", and the
>>>> "StreamTableEnvironment.registerTableSource" method is deprecated. There is
>>>> a link to the topic "register a TableSource" on the linked docs page, but
>>>> the link seems to be dead; hence I found no other way to register a
>>>> TableSource.
>>>>
>>>> I could also write a "SourceFunction" myself, pull from the OpenTSDB
>>>> database in there and return the DataStream from the fetched Collection,
>>>> but I am not sure whether this is an efficient way.
>>>> And if I did it this "manual" way, how do I avoid pulling the whole
>>>> database every time?
>>>>
>>>> Any help is much appreciated, even if it is just a small pointer in the
>>>> right direction.
>>>>
>>>> Thanks in advance!
>>>>
>>>> Sincerely,
>>>> Lucas
>>>>
>>>>
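
To make Jark's registration steps concrete, a hedged skeleton against Flink
1.10's TableFactory SPI; the connector.type value, the property keys, and the
OpenTSDBTableSource class are hypothetical stand-ins for the user's own code:

    package com.example;

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.flink.table.factories.StreamTableSourceFactory;
    import org.apache.flink.table.sources.StreamTableSource;
    import org.apache.flink.types.Row;

    public class OpenTSDBTableSourceFactory implements StreamTableSourceFactory<Row> {

        @Override
        public Map<String, String> requiredContext() {
            // Matched against the WITH (...) clause of the SQL DDL statement.
            Map<String, String> context = new HashMap<>();
            context.put("connector.type", "opentsdb"); // hypothetical identifier
            return context;
        }

        @Override
        public List<String> supportedProperties() {
            // Hypothetical property keys the connector would accept.
            return Arrays.asList("connector.url", "connector.metric");
        }

        @Override
        public StreamTableSource<Row> createStreamTableSource(Map<String, String> properties) {
            return new OpenTSDBTableSource(properties); // the user's own TableSource
        }
    }

The factory is then announced through the services file Jark mentions, one
fully-qualified class name per line:

    # META-INF/services/org.apache.flink.table.factories.TableFactory
    com.example.OpenTSDBTableSourceFactory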


Re: How to use OpenTSDB as Source?

2020-04-22 Thread Som Lima
For the sake of brevity, the code example does not show the complete code
for setting up the environment using the EnvironmentSettings class:

EnvironmentSettings settings = EnvironmentSettings.newInstance()...


As you can see, by comparison the same protocol is not followed when showing
how to set up the environment.


StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();

or

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);

or

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

Is there complete code somewhere?
Please give me a link.


[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/create.html


On Wed, 22 Apr 2020, 09:36 Marta Paes Moreira,  wrote:

> Hi, Lucas.
>
> There was a lot of refactoring in the Table API / SQL in the last release,
> so the user experience is not ideal at the moment — sorry for that.
>
> You can try using the DDL syntax to create your table, as shown in [1,2].
> I'm CC'ing Timo and Jark, who should be able to help you further.
>
> Marta
>
> [1] https://flink.apache.org/news/2020/02/20/ddl.html
> [2]
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/create.html
>
> On Tue, Apr 21, 2020 at 7:02 PM Lucas Kinne <
> lucas.ki...@stud-mail.uni-wuerzburg.de> wrote:
>
>> Hey guys,
>>
>> in a university project we are storing our collected sensor data in an 
>> OpenTSDB
>> database.
>> I am now trying to use this database as a source in Apache Flink, but I
>> can't seem to figure out how to do it.
>>
>> I have seen that there is no existing connector for this database, but I
>> read in the docs
>> that it is possible to implement a custom (Batch/Streaming)TableSource.
>> There is a Java client for OpenTSDB, which
>> could be used for that.
>>
>> So I created a new Java Class "OpenTSDBTableSource" that implements
>> "StreamTableSource", "DefinedProctimeAttribute", "DefinedRowtimeAttribute"
>> and "LookupableTableSource", as suggested in the docs.
>> However, I have no idea how to register this TableSource. The
>> "StreamExecutionEnvironment.addSource" method requires a "SourceFunction"
>> parameter instead of my "TableSource", and the
>> "StreamTableEnvironment.registerTableSource" method is deprecated. There is
>> a link to the topic "register a TableSource" on the linked docs page, but
>> the link seems to be dead; hence I found no other way to register a
>> TableSource.
>>
>> I could also write a "SourceFunction" myself, pull from the OpenTSDB database
>> in there and return the DataStream from the fetched Collection, but I am
>> not sure whether this is an efficient way.
>> And if I did it this "manual" way, how do I avoid pulling the whole
>> database every time?
>>
>> Any help is much appreciated, even if it is just a small pointer in the
>> right direction.
>>
>> Thanks in advance!
>>
>> Sincerely,
>> Lucas
>>
>
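
On the last question above (avoiding re-reading the whole database), a common
pattern is a polling source that remembers where its previous query ended and
fetches only the new interval. A hedged sketch; queryRange and its use of the
OpenTSDB client are hypothetical:

    import java.util.Collections;
    import java.util.List;
    import org.apache.flink.streaming.api.functions.source.SourceFunction;

    public class OpenTSDBPollingSource implements SourceFunction<String> {

        private volatile boolean running = true;
        private long lastQueryEndMs = 0L; // where the previous poll stopped

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            while (running) {
                long now = System.currentTimeMillis();
                // Fetch only the data points written since the last poll.
                for (String dataPoint : queryRange(lastQueryEndMs, now)) {
                    synchronized (ctx.getCheckpointLock()) {
                        ctx.collect(dataPoint);
                    }
                }
                lastQueryEndMs = now;
                Thread.sleep(1_000L); // poll interval
            }
        }

        @Override
        public void cancel() {
            running = false;
        }

        // Hypothetical helper that would wrap the OpenTSDB Java/HTTP client.
        private List<String> queryRange(long startMs, long endMs) {
            return Collections.emptyList();
        }
    }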


Re: Job manager URI rpc address:port

2020-04-20 Thread Som Lima
This is the code I was looking for, which will allow me to connect
programmatically to a remote jobmanager, the same as with a Spark remote
master. The Spark master shares the compute load with the slaves; in the case
of Flink, the jobmanager does so with the taskmanagers.


Configuration conf = new Configuration();
conf.setString("mykey", "myvalue");
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.getConfig().setGlobalJobParameters(conf);


I found it at the bottom of this page:

https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/batch/index.html
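
For completeness, the docs page linked above also shows how such global
parameters are read back inside a rich user function; a minimal sketch along
those lines (MyMapper and the default value are illustrative):

    import org.apache.flink.api.common.ExecutionConfig;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class MyMapper extends RichMapFunction<String, String> {

        private String myValue;

        @Override
        public void open(Configuration parameters) {
            // Global job parameters travel with the ExecutionConfig to every task.
            ExecutionConfig.GlobalJobParameters globalParams =
                getRuntimeContext().getExecutionConfig().getGlobalJobParameters();
            Configuration conf = (Configuration) globalParams;
            myValue = conf.getString("mykey", "defaultValue");
        }

        @Override
        public String map(String input) {
            return input + "-" + myValue;
        }
    }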




On Sun, 19 Apr 2020, 11:02 tison,  wrote:

> You can change the flink-conf.yaml "jobmanager.address" or "jobmanager.port"
> options before running the program, or take a look at RemoteStreamEnvironment,
> which enables configuring the host and port.
>
> Best,
> tison.
>
>
> On Sun, 19 Apr 2020, 17:58 Som Lima wrote:
>
>> Hi,
>>
>> After running
>>
>> $ ./bin/start-cluster.sh
>>
>> The following line of code defaults the jobmanager to localhost:6123
>>
>> final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>>
>> which is the same as on Spark:
>>
>> val spark =
>> SparkSession.builder.master("local[*]").appName("anapp").getOrCreate
>>
>> However, if I wish to run the servers on a different physical computer,
>> then in Spark I can do it this way, using the Spark URI in my IDE:
>>
>> val conf = new
>> SparkConf().setMaster("spark://<host>:<port>").setAppName("anapp")
>>
>> Can you please tell me the equivalent change to make so I can run my
>> servers and my IDE on different physical computers?
>>


Re: FLINK JOB solved

2020-04-20 Thread Som Lima
I found the problem.

In the flink-1.10.0/conf directory
there are two files:
Masters
and slaves.

The Masters file contains localhost:8081;
the slaves file contains just localhost.

I changed them both to the server's IP address.

Now the FLINK JOB link has the full <ip>:8081 URL and displays the Apache
Flink Dashboard in the browser.
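
For illustration, with a single server at a hypothetical address 192.168.1.10,
the two files would then read:

    # conf/masters
    192.168.1.10:8081

    # conf/slaves
    192.168.1.10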




On Mon, 20 Apr 2020, 12:07 Som Lima,  wrote:

> Yes, exactly, that is the change I am having to make: changing the FLINK JOB
> default localhost to the IP of the server computer in the browser.
>
> I followed the instructions as per your
> link:
>
> https://medium.com/@zjffdu/flink-on-zeppelin-part-1-get-started-2591aaa6aa47
>
> i.e. setting zeppelin.server.addr to 0.0.0.0 for remote access.
>
>
>
> On Mon, 20 Apr 2020, 10:30 Jeff Zhang,  wrote:
>
>> I see, so you are running the flink interpreter in local mode, but you
>> access zeppelin from a remote machine, right? Do you mean you can access it
>> after changing localhost to the ip? If so, then I can add a configuration on
>> the zeppelin side to replace localhost with the real ip.
>>
>> On Mon, 20 Apr 2020, 16:44 Som Lima wrote:
>>
>>> I am only running the zeppelin  word count example by clicking the
>>> zeppelin run arrow.
>>>
>>>
>>> On Mon, 20 Apr 2020, 09:42 Jeff Zhang,  wrote:
>>>
>>>> How do you run the flink job? It should not always be localhost:8081.
>>>>
>>>> On Mon, 20 Apr 2020, 16:33 Som Lima wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> The FLINK JOB url defaults to localhost,
>>>>>
>>>>> i.e. localhost:8081.
>>>>>
>>>>> I have to manually change it to <server-ip>:8081 to get the Apache Flink
>>>>> Web Dashboard to display.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>


Re: FLINK JOB

2020-04-20 Thread Som Lima
Yes, exactly, that is the change I am having to make: changing the FLINK JOB
default localhost to the IP of the server computer in the browser.

I followed the instructions as per your
link:
https://medium.com/@zjffdu/flink-on-zeppelin-part-1-get-started-2591aaa6aa47

i.e. setting zeppelin.server.addr to 0.0.0.0 for remote access.
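
For illustration, one common way to configure that bind address (assuming a
default Zeppelin layout; restart Zeppelin afterwards):

    # conf/zeppelin-env.sh
    export ZEPPELIN_ADDR=0.0.0.0   # bind the Zeppelin server to all interfaces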



On Mon, 20 Apr 2020, 10:30 Jeff Zhang,  wrote:

> I see, so you are running the flink interpreter in local mode, but you
> access zeppelin from a remote machine, right? Do you mean you can access it
> after changing localhost to the ip? If so, then I can add a configuration on
> the zeppelin side to replace localhost with the real ip.
>
> On Mon, 20 Apr 2020, 16:44 Som Lima wrote:
>
>> I am only running the zeppelin  word count example by clicking the
>> zeppelin run arrow.
>>
>>
>> On Mon, 20 Apr 2020, 09:42 Jeff Zhang,  wrote:
>>
>>> How do you run the flink job? It should not always be localhost:8081.
>>>
>>> On Mon, 20 Apr 2020, 16:33 Som Lima wrote:
>>>
>>>> Hi,
>>>>
>>>> The FLINK JOB url defaults to localhost,
>>>>
>>>> i.e. localhost:8081.
>>>>
>>>> I have to manually change it to <server-ip>:8081 to get the Apache Flink
>>>> Web Dashboard to display.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: FLINK JOB

2020-04-20 Thread Som Lima
I am only running the zeppelin  word count example by clicking the zeppelin
run arrow.


On Mon, 20 Apr 2020, 09:42 Jeff Zhang,  wrote:

> How do you run the flink job? It should not always be localhost:8081.
>
> On Mon, 20 Apr 2020, 16:33 Som Lima wrote:
>
>> Hi,
>>
>> The FLINK JOB url defaults to localhost,
>>
>> i.e. localhost:8081.
>>
>> I have to manually change it to <server-ip>:8081 to get the Apache Flink
>> Web Dashboard to display.
>>
>>
>>
>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>


FLINK JOB

2020-04-20 Thread Som Lima
Hi,

The FLINK JOB url defaults to localhost,

i.e. localhost:8081.

I have to manually change it to <server-ip>:8081 to get the Apache Flink Web
Dashboard to display.


Re: Job manager URI rpc address:port

2020-04-19 Thread Som Lima
I will, thanks. Once I had it set up and working,
I switched my computers around, from client to server and server to client.
With your excellent instructions I was able to do it in 5 minutes.

On Mon, 20 Apr 2020, 00:05 Jeff Zhang,  wrote:

> Som, let us know if you have any problems.
>
> On Mon, 20 Apr 2020, 02:31 Som Lima wrote:
>
>> Thanks for the info and links.
>>
>> I had a lot of problems; I am not sure what I was doing wrong.
>>
>> Maybe conflicts with the setup from Apache Spark. I think I may need to
>> set up users for each development environment.
>>
>>
>> Anyway, I kept doing fresh installs, about four altogether I think.
>>
>> Everything works fine now,
>> including remote access of zeppelin on machines across the local area
>> network.
>>
>> Next step: set up remote clusters.
>> Wish me luck!
>>
>> On Sun, 19 Apr 2020, 14:58 Jeff Zhang,  wrote:
>>
>>> Hi Som,
>>>
>>> You can take a look at flink on zeppelin; in zeppelin you can connect to
>>> a remote flink cluster via a little configuration, and you don't need to
>>> worry about the jars. The Flink interpreter will ship the necessary jars
>>> for you. Here's a list of tutorials:
>>>
>>> 1) Get started: https://link.medium.com/oppqD6dIg5
>>> 2) Batch: https://link.medium.com/3qumbwRIg5
>>> 3) Streaming: https://link.medium.com/RBHa2lTIg5
>>> 4) Advanced usage: https://link.medium.com/CAekyoXIg5
>>>
>>>
>>> On Sun, 19 Apr 2020, 19:27 Zahid Rahman wrote:
>>>
>>>> Hi Tison,
>>>>
>>>> I think I may have found what I want in example 22.
>>>>
>>>> https://www.programcreek.com/java-api-examples/?api=org.apache.flink.configuration.Configuration
>>>>
>>>> I need to create a Configuration object first, as shown.
>>>>
>>>> Also, I think the flink-conf.yaml file may contain configuration for the
>>>> client rather than the server, so "before starting" is irrelevant.
>>>> I am going to play around and see, but if the Configuration class allows
>>>> me to set configuration programmatically and overrides the yaml file, then
>>>> that would be great.
>>>>
>>>>
>>>>
>>>> On Sun, 19 Apr 2020, 11:35 Som Lima,  wrote:
>>>>
>>>>> Thanks.
>>>>> flink-conf.yaml does allow me to do what I need to do without making
>>>>> any changes to the client source code.
>>>>>
>>>>> But the
>>>>> RemoteStreamEnvironment constructor expects a jar file as the third
>>>>> parameter also:
>>>>>
>>>>> RemoteStreamEnvironment(String host, int port, String... jarFiles)
>>>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.7/api/java/org/apache/flink/streaming/api/environment/RemoteStreamEnvironment.html#RemoteStreamEnvironment-java.lang.String-int-java.lang.String...->
>>>>> Creates a new RemoteStreamEnvironment that points to the master
>>>>> (JobManager) described by the given host name and port.
>>>>> On Sun, 19 Apr 2020, 11:02 tison,  wrote:
>>>>>
>>>>>> You can change the flink-conf.yaml "jobmanager.address" or
>>>>>> "jobmanager.port" options before running the program, or take a look at
>>>>>> RemoteStreamEnvironment, which enables configuring the host and port.
>>>>>>
>>>>>> Best,
>>>>>> tison.
>>>>>>
>>>>>>
>>>>>> On Sun, 19 Apr 2020, 17:58 Som Lima wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> After running
>>>>>>>
>>>>>>> $ ./bin/start-cluster.sh
>>>>>>>
>>>>>>> The following line of code defaults the jobmanager to localhost:6123
>>>>>>>
>>>>>>> final ExecutionEnvironment env =
>>>>>>> ExecutionEnvironment.getExecutionEnvironment();
>>>>>>>
>>>>>>> which is the same as on Spark:
>>>>>>>
>>>>>>> val spark =
>>>>>>> SparkSession.builder.master("local[*]").appName("anapp").getOrCreate
>>>>>>>
>>>>>>> However, if I wish to run the servers on a different physical
>>>>>>> computer, then in Spark I can do it this way, using the Spark URI in my IDE:
>>>>>>>
>>>>>>> val conf = new
>>>>>>> SparkConf().setMaster("spark://<host>:<port>").setAppName("anapp")
>>>>>>>
>>>>>>> Can you please tell me the equivalent change to make so I can run my
>>>>>>> servers and my IDE on different physical computers?
>>>>>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Job manager URI rpc address:port

2020-04-19 Thread Som Lima
Thanks for the info and links.

I had a lot of problems; I am not sure what I was doing wrong.

Maybe conflicts with the setup from Apache Spark. I think I may need to set up
users for each development environment.


Anyway, I kept doing fresh installs, about four altogether I think.

Everything works fine now,
including remote access of zeppelin on machines across the local area
network.

Next step: set up remote clusters.
Wish me luck!

On Sun, 19 Apr 2020, 14:58 Jeff Zhang,  wrote:

> Hi Som,
>
> You can take a look at flink on zeppelin; in zeppelin you can connect to a
> remote flink cluster via a little configuration, and you don't need to worry
> about the jars. The Flink interpreter will ship the necessary jars for you.
> Here's a list of tutorials:
>
> 1) Get started: https://link.medium.com/oppqD6dIg5
> 2) Batch: https://link.medium.com/3qumbwRIg5
> 3) Streaming: https://link.medium.com/RBHa2lTIg5
> 4) Advanced usage: https://link.medium.com/CAekyoXIg5
>
>
> On Sun, 19 Apr 2020, 19:27 Zahid Rahman wrote:
>
>> Hi Tison,
>>
>> I think I may have found what I want in example 22.
>>
>> https://www.programcreek.com/java-api-examples/?api=org.apache.flink.configuration.Configuration
>>
>> I need to create a Configuration object first, as shown.
>>
>> Also, I think the flink-conf.yaml file may contain configuration for the
>> client rather than the server, so "before starting" is irrelevant.
>> I am going to play around and see, but if the Configuration class allows
>> me to set configuration programmatically and overrides the yaml file, then
>> that would be great.
>>
>>
>>
>> On Sun, 19 Apr 2020, 11:35 Som Lima,  wrote:
>>
>>> Thanks.
>>> flink-conf.yaml does allow me to do what I need to do without making any
>>> changes to the client source code.
>>>
>>> But the
>>> RemoteStreamEnvironment constructor expects a jar file as the third
>>> parameter also:
>>>
>>> RemoteStreamEnvironment(String host, int port, String... jarFiles)
>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.7/api/java/org/apache/flink/streaming/api/environment/RemoteStreamEnvironment.html#RemoteStreamEnvironment-java.lang.String-int-java.lang.String...->
>>> Creates a new RemoteStreamEnvironment that points to the master
>>> (JobManager) described by the given host name and port.
>>>
>>> On Sun, 19 Apr 2020, 11:02 tison,  wrote:
>>>
>>>> You can change the flink-conf.yaml "jobmanager.address" or
>>>> "jobmanager.port" options before running the program, or take a look at
>>>> RemoteStreamEnvironment, which enables configuring the host and port.
>>>>
>>>> Best,
>>>> tison.
>>>>
>>>>
>>>> On Sun, 19 Apr 2020, 17:58 Som Lima wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> After running
>>>>>
>>>>> $ ./bin/start-cluster.sh
>>>>>
>>>>> The following line of code defaults the jobmanager to localhost:6123
>>>>>
>>>>> final ExecutionEnvironment env =
>>>>> ExecutionEnvironment.getExecutionEnvironment();
>>>>>
>>>>> which is the same as on Spark:
>>>>>
>>>>> val spark =
>>>>> SparkSession.builder.master("local[*]").appName("anapp").getOrCreate
>>>>>
>>>>> However, if I wish to run the servers on a different physical computer,
>>>>> then in Spark I can do it this way, using the Spark URI in my IDE:
>>>>>
>>>>> val conf = new
>>>>> SparkConf().setMaster("spark://<host>:<port>").setAppName("anapp")
>>>>>
>>>>> Can you please tell me the equivalent change to make so I can run my
>>>>> servers and my IDE on different physical computers?
>>>>>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Job manager URI rpc address:port

2020-04-19 Thread Som Lima
Thanks.
flink-conf.yaml does allow me to do what I need to do without making any
changes to the client source code.

But the
RemoteStreamEnvironment constructor expects a jar file as the third
parameter also:

RemoteStreamEnvironment(String host, int port, String... jarFiles)
<https://ci.apache.org/projects/flink/flink-docs-release-1.7/api/java/org/apache/flink/streaming/api/environment/RemoteStreamEnvironment.html#RemoteStreamEnvironment-java.lang.String-int-java.lang.String...->
Creates a new RemoteStreamEnvironment that points to the master
(JobManager) described by the given host name and port.

On Sun, 19 Apr 2020, 11:02 tison,  wrote:

> You can change the flink-conf.yaml "jobmanager.address" or "jobmanager.port"
> options before running the program, or take a look at RemoteStreamEnvironment,
> which enables configuring the host and port.
>
> Best,
> tison.
>
>
> On Sun, 19 Apr 2020, 17:58 Som Lima wrote:
>
>> Hi,
>>
>> After running
>>
>> $ ./bin/start-cluster.sh
>>
>> The following line of code defaults the jobmanager to localhost:6123
>>
>> final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>>
>> which is the same as on Spark:
>>
>> val spark =
>> SparkSession.builder.master("local[*]").appName("anapp").getOrCreate
>>
>> However, if I wish to run the servers on a different physical computer,
>> then in Spark I can do it this way, using the Spark URI in my IDE:
>>
>> val conf = new
>> SparkConf().setMaster("spark://<host>:<port>").setAppName("anapp")
>>
>> Can you please tell me the equivalent change to make so I can run my
>> servers and my IDE on different physical computers?
>>


Job manager URI rpc address:port

2020-04-19 Thread Som Lima
Hi,

After running

$ ./bin/start-cluster.sh

The following line of code defaults the jobmanager to localhost:6123

final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

which is the same as on Spark:

val spark =
SparkSession.builder.master("local[*]").appName("anapp").getOrCreate

However, if I wish to run the servers on a different physical computer,
then in Spark I can do it this way, using the Spark URI in my IDE:

val conf = new SparkConf().setMaster("spark://<host>:<port>").setAppName("anapp")

Can you please tell me the equivalent change to make so I can run my
servers and my IDE on different physical computers?
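
For reference, the two Flink equivalents that come up elsewhere in this
thread, as a hedged sketch; the host, port, and jar path are placeholders,
and the config keys are the Flink 1.10 names:

    # Option 1: point the client at the remote JobManager in conf/flink-conf.yaml
    jobmanager.rpc.address: 192.168.1.10
    jobmanager.rpc.port: 6123

    // Option 2: programmatically from the IDE; the job jar is shipped to the cluster
    ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment(
        "192.168.1.10", 6123, "target/my-job.jar");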