Glad I could help in any way. :) When the address is set to “localhost”, I cannot submit a job at all; it fails immediately. When the address is “127.0.0.1”, the job is stuck on DEPLOYING for a little while and then fails. Correct me if I’m wrong, but since using the address from the config file won’t harm anything, I think it is safer to use that rather than hard-coding an address in the code.
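In case it helps with debugging, here is a small sketch that prints what “localhost” and “127.0.0.1” resolve to on a machine and whether each result is a loopback address. It is not part of Flink; the class name LocalhostCheck and the method isLoopback are made up for illustration, and it only uses java.net.InetAddress.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only, not part of Flink.
class LocalhostCheck {

    // Returns true only if the name resolves and every resolved address
    // is a loopback address (127.0.0.0/8 or ::1).
    static boolean isLoopback(String host) {
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                if (!addr.isLoopbackAddress()) {
                    return false;
                }
            }
            return true;
        } catch (UnknownHostException e) {
            // An unresolvable name is treated as "not loopback".
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Print every address the local resolver returns for each name.
        for (String host : new String[] {"localhost", "127.0.0.1"}) {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                System.out.println(host + " -> " + addr.getHostAddress()
                        + " (loopback: " + addr.isLoopbackAddress() + ")");
            }
        }
    }
}
```

If “localhost” is directed to something in a private range on a machine, the “localhost” lines should show a non-loopback address, which would match what I am seeing.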
> On Mar 5, 2015, at 6:57 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>
> Could you submit a job when you set the job manager address to "localhost"?
> I did not see any logging statements of received jobs. If you did, could
> you also send the logs of the client?
>
> The 0.0.0.0 to which the BlobServer binds works for me on my machine. I
> cannot remember that we had problems with that before. But I agree, we
> should set it to the network interface which the JobManager uses.
>
> I cannot explain why your fix solves the problem. It does not touch any of
> the JobClient/JobManager logic.
>
> I updated my local branch [1] with a fix for the BlobServer. Could you try
> it out again and send us the logs? Thanks a lot for your help Dulaj.
>
> On Thu, Mar 5, 2015 at 1:24 PM, Dulaj Viduranga <vidura...@icloud.com>
> wrote:
>
>> But can you explain why did my fix solved it?
>>
>>> On Mar 5, 2015, at 5:50 PM, Stephan Ewen <se...@apache.org> wrote:
>>>
>>> Hi Dulaj!
>>>
>>> Okay, the logs give us some insight. Both setups seem to look good in terms
>>> of TaskManager and JobManager startup.
>>>
>>> In one of the logs (127.0.0.1) you submit a job. The job fails because the
>>> TaskManager cannot grab the JAR file from the JobManager.
>>> I think the problem is that the BLOB server binds to 0.0.0.0 - it should
>>> bind to the same address as the JobManager actor system.
>>>
>>> That should definitely be changed...
>>>
>>> On Thu, Mar 5, 2015 at 10:08 AM, Dulaj Viduranga <vidura...@icloud.com>
>>> wrote:
>>>
>>>> Hi,
>>>> This is the log with setting “localhost”
>>>> flink-Vidura-jobmanager-localhost.log <https://gist.github.com/viduranga/e9d43521587697de3eb5#file-flink-vidura-jobmanager-localhost-log>
>>>>
>>>> And this is the log with setting “127.0.0.1”
>>>> flink-Vidura-jobmanager-localhost.log <https://gist.github.com/viduranga/5af6b05f204e1f4b344f#file-flink-vidura-jobmanager-localhost-log>
>>>>
>>>>> On Mar 5, 2015, at 2:23 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>
>>>>> What does the jobmanager log say? I think Stephan added some more logging
>>>>> output which helps us to debug this problem.
>>>>>
>>>>> On Thu, Mar 5, 2015 at 9:36 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>> wrote:
>>>>>
>>>>>> Using start-local.sh.
>>>>>> I’m using the original config yaml. I also tried changing jobmanager
>>>>>> address in config to “127.0.0.1” but no luck. With my changes it works ok.
>>>>>> The conf file follows.
>>>>>>
>>>>>> ################################################################################
>>>>>> # Licensed to the Apache Software Foundation (ASF) under one
>>>>>> # or more contributor license agreements. See the NOTICE file
>>>>>> # distributed with this work for additional information
>>>>>> # regarding copyright ownership. The ASF licenses this file
>>>>>> # to you under the Apache License, Version 2.0 (the
>>>>>> # "License"); you may not use this file except in compliance
>>>>>> # with the License. You may obtain a copy of the License at
>>>>>> #
>>>>>> # http://www.apache.org/licenses/LICENSE-2.0
>>>>>> #
>>>>>> # Unless required by applicable law or agreed to in writing, software
>>>>>> # distributed under the License is distributed on an "AS IS" BASIS,
>>>>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>>>>> # See the License for the specific language governing permissions and
>>>>>> # limitations under the License.
>>>>>> ################################################################################
>>>>>>
>>>>>> #==============================================================================
>>>>>> # Common
>>>>>> #==============================================================================
>>>>>>
>>>>>> jobmanager.rpc.address: 127.0.0.1
>>>>>>
>>>>>> jobmanager.rpc.port: 6123
>>>>>>
>>>>>> jobmanager.heap.mb: 256
>>>>>>
>>>>>> taskmanager.heap.mb: 512
>>>>>>
>>>>>> taskmanager.numberOfTaskSlots: 1
>>>>>>
>>>>>> parallelization.degree.default: 1
>>>>>>
>>>>>> #==============================================================================
>>>>>> # Web Frontend
>>>>>> #==============================================================================
>>>>>>
>>>>>> # The port under which the web-based runtime monitor listens.
>>>>>> # A value of -1 deactivates the web server.
>>>>>>
>>>>>> jobmanager.web.port: 8081
>>>>>>
>>>>>> # The port under which the standalone web client
>>>>>> # (for job upload and submit) listens.
>>>>>>
>>>>>> webclient.port: 8080
>>>>>>
>>>>>> #==============================================================================
>>>>>> # Advanced
>>>>>> #==============================================================================
>>>>>>
>>>>>> # The number of buffers for the network stack.
>>>>>> #
>>>>>> # taskmanager.network.numberOfBuffers: 2048
>>>>>>
>>>>>> # Directories for temporary files.
>>>>>> #
>>>>>> # Add a delimited list for multiple directories, using the system directory
>>>>>> # delimiter (colon ':' on unix) or a comma, e.g.:
>>>>>> # /data1/tmp:/data2/tmp:/data3/tmp
>>>>>> #
>>>>>> # Note: Each directory entry is read from and written to by a different I/O
>>>>>> # thread.
>>>>>> # You can include the same directory multiple times in order to create
>>>>>> # multiple I/O threads against that directory. This is for example relevant for
>>>>>> # high-throughput RAIDs.
>>>>>> #
>>>>>> # If not specified, the system-specific Java temporary directory (java.io.tmpdir
>>>>>> # property) is taken.
>>>>>> #
>>>>>> # taskmanager.tmp.dirs: /tmp
>>>>>>
>>>>>> # Path to the Hadoop configuration directory.
>>>>>> #
>>>>>> # This configuration is used when writing into HDFS. Unless specified otherwise,
>>>>>> # HDFS file creation will use HDFS default settings with respect to block-size,
>>>>>> # replication factor, etc.
>>>>>> #
>>>>>> # You can also directly specify the paths to hdfs-default.xml and hdfs-site.xml
>>>>>> # via keys 'fs.hdfs.hdfsdefault' and 'fs.hdfs.hdfssite'.
>>>>>> #
>>>>>> # fs.hdfs.hadoopconf: /path/to/hadoop/conf/
>>>>>>
>>>>>>> On Mar 5, 2015, at 2:03 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>>
>>>>>>> How did you start the flink cluster? Using the start-local.sh, the
>>>>>>> start-cluster.sh or starting the job manager and task managers individually
>>>>>>> using taskmanager.sh/jobmanager.sh. Could you maybe post the
>>>>>>> flink-conf.yaml file, you're using?
>>>>>>>
>>>>>>> With your changes, everything works, right?
>>>>>>>
>>>>>>> On Thu, Mar 5, 2015 at 8:55 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Till,
>>>>>>>> I’m sorry. It doesn’t seem to solve the problem. The taskmanager still
>>>>>>>> tries a 10.0.0.0/8 IP.
>>>>>>>>
>>>>>>>> Best regards.
>>>>>>>>
>>>>>>>>> On Mar 5, 2015, at 1:00 PM, Till Rohrmann <till.rohrm...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Dulaj,
>>>>>>>>>
>>>>>>>>> I looked through your commit and noticed that the JobClient might not be
>>>>>>>>> listening on the right network interface. Your commit seems to fix it.
>>>>>>>>> I just want to understand the problem properly and therefore I opened a
>>>>>>>>> branch with a small change. Could you try out whether this change would
>>>>>>>>> also fix your problem? You can find the code here [1]. Would be awesome if
>>>>>>>>> you checked it out and let it run on your cluster setting. Thanks a lot
>>>>>>>>> Dulaj!
>>>>>>>>>
>>>>>>>>> [1] https://github.com/tillrohrmann/flink/tree/fixLocalFlinkMiniClusterJobClient
>>>>>>>>>
>>>>>>>>> On Thu, Mar 5, 2015 at 4:21 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Not every change in commit b7da22a is required, but I thought they
>>>>>>>>>> were appropriate.
>>>>>>>>>>
>>>>>>>>>>> On Mar 5, 2015, at 8:11 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> I found many other places where “localhost” is hard coded. I changed them
>>>>>>>>>>> in a better way, I think. I made a pull request. Please review. b7da22a
>>>>>>>>>>> <https://github.com/viduranga/flink/commit/b7da22a562d3da5a9be2657308c0f82e4e2f80cd>
>>>>>>>>>>>
>>>>>>>>>>>> On Mar 4, 2015, at 8:17 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> If I recall correctly, we only hardcode "localhost" in the local mini
>>>>>>>>>>>> cluster - do you think it is problematic there as well?
>>>>>>>>>>>>
>>>>>>>>>>>> Have you found any other places?
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 2, 2015 at 10:26 AM, Dulaj Viduranga <vidura...@icloud.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> In some places of the code, "localhost" is hard coded. When it is resolved
>>>>>>>>>>>>> by the DNS, it is possible to be directed to a different IP other than
>>>>>>>>>>>>> 127.0.0.1 (like private range 10.0.0.0/8).
>>>>>>>>>>>>> I changed those places to 127.0.0.1 and it works like a charm.
>>>>>>>>>>>>> But hard coding 127.0.0.1 is not a good option because when the jobmanager
>>>>>>>>>>>>> ip is changed, this becomes an issue again. I'm thinking of setting
>>>>>>>>>>>>> jobmanager ip from the config.yaml to these places.
>>>>>>>>>>>>> If you have a better idea on doing this with your experience, please let
>>>>>>>>>>>>> me know.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best.