Re: off-heap size feature request

2016-03-16 Thread Fabian Hueske
Hi Ovidiu,

the parameters to configure the amount of managed memory
(taskmanager.memory.size, taskmanager.memory.fraction) are valid for both
on-heap and off-heap memory.
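
For illustration, a flink-conf.yaml sketch (the values here are only examples,
and the off-heap switch key is taken from the configuration docs you linked):

```
# example values, adjust to your setup
taskmanager.memory.off-heap: true   # allocate the managed memory off-heap
taskmanager.memory.size: 2048       # absolute amount of managed memory (MB)
# or, instead of an absolute size, a relative fraction:
# taskmanager.memory.fraction: 0.7
```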

Have you tried these parameters, and did they not work as expected?

Best, Fabian


2016-03-16 11:43 GMT+01:00 Ovidiu-Cristian MARCU <
ovidiu-cristian.ma...@inria.fr>:

> Hi,
>
> Is it possible to add a parameter off-heap.size for the task manager
> off-heap memory [1]?
>
> It does not seem to be possible to limit the off-heap memory size; at least,
> I found nothing in the documentation.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html#managed-memory
>
> Best,
> Ovidiu
>


off-heap size feature request

2016-03-16 Thread Ovidiu-Cristian MARCU
Hi,

Is it possible to add a parameter off-heap.size for the task manager off-heap 
memory [1]?

It does not seem to be possible to limit the off-heap memory size; at least, I
found nothing in the documentation.

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html#managed-memory
 


Best,
Ovidiu

Re: relation between operator and task

2016-03-16 Thread Till Rohrmann
Hi Radu,

the mapping of which StreamOperator is executed by which StreamTask happens
first in the StreamGraph.addOperator method. However, there is a second
step in StreamingJobGraphGenerator.createChain, where chainable
operators are chained together and then executed by a single StreamTask. The
construction of the actual operator chain happens in the OperatorChain class.
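
For illustration, a minimal sketch (Scala DataStream API; the pipeline itself
is made up) of how chaining can be influenced from user code:

```
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
// env.disableOperatorChaining() would switch chaining off globally.

val counts = env
  .fromElements("a", "b", "a")
  .map(w => (w, 1))   // chainable with the surrounding operators
  .filter(_._2 > 0)
  .startNewChain()    // forces the filter to begin a new chain, i.e. a new task
  .keyBy(0)
  .sum(1)

counts.print()
env.execute("chaining sketch")
```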

Cheers,
Till
​

On Tue, Mar 8, 2016 at 10:51 AM, Radu Tudoran 
wrote:

> Hi,
>
>
>
> Thanks for the answer. Can you point me to the code where the operators
> are being assigned to tasks?
>
> Thanks
>
>
>
> *From:* ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] *On Behalf
> Of *Stephan Ewen
> *Sent:* Monday, March 07, 2016 8:29 PM
> *To:* user@flink.apache.org
> *Subject:* Re: relation between operator and task
>
>
>
> Hi!
>
>
>
> A "task" is something that is deployed as one unit to the TaskManager and
> runs in one thread.
>
>
>
> A task can have multiple "operators" chained together, usually one per
> user function, for example "Map -> Filter -> FlatMap -> AssignTimestamps ->
> ..."
>
>
>
> Stephan
>
>
>
>
>
> On Mon, Mar 7, 2016 at 7:36 PM, Radu Tudoran 
> wrote:
>
> Hi,
>
>
>
> Can someone explain how and where a stream operator is mapped to a stream
> task?
>
> I am particularly interested in the way the stream outputs are created and
> attached to the operators. I saw that this happens in the OperatorChain
> functions, but I do not have a clear picture of the lifecycle of a stream
> operator, from the operator you create to its mapping to a task and the
> assignment of its output bindings.
>
>
>


Re: Error when accessing secure HDFS with standalone Flink

2016-03-16 Thread Stefano Baghino
Hi Max,

thanks for the tips. What we did was run kinit on each node with the same
user that then ran the start-cluster.sh script. Right now the LDAP groups
are backed by the OS ones, and the user that ran the launch script is part
of the flink group, which exists on every node of the cluster and has full
access to the flink directory (which is placed under the same path on every
node).

Would this have been enough to kerberize Flink?

Also: once a user runs Flink in secure mode, is every deployed job run as
the user that ran the start-cluster.sh script (same behavior as running a
YARN session)? Or can users kinit on each node and then submit jobs that
will be run individually with their own credentials?

Thanks again.

On Wed, Mar 16, 2016 at 10:30 AM, Maximilian Michels  wrote:

> Hi Stefano,
>
> You have probably seen
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html#kerberos
> ?
>
> Currently, all nodes need to be authenticated with Kerberos before
> Flink is started (not just the JobManager). Could it be that the
> start-cluster.sh script is actually not authenticated using Kerberos
> at the nodes it sshes to when it starts the TaskManagers?
>
> Best,
> Max
>
>
> On Fri, Mar 11, 2016 at 8:17 AM, Stefano Baghino
>  wrote:
> > Hello everybody,
> >
> > my colleagues and I have been running some tests on Flink 1.0.0 in a
> secure
> > environment (Kerberos). Yesterday we did several tests on the standalone
> > Flink deployment but couldn't get it to access HDFS. Judging from the
> error
> > it looks like Flink is not trying to authenticate itself with Kerberos.
> The
> > root cause of the error is
> > "org.apache.hadoop.security.AccessControlException: SIMPLE
> authentication is
> > not enabled.  Available:[TOKEN, KERBEROS]". I've put the whole logs in
> this
> > gist. I've gone through the source code and, judging from what I saw, this
> > error is emitted by Hadoop if a client is not using any authentication
> > method on a secure cluster. Also, in the source code of Flink, it looks
> like
> > when running a job on a secure cluster a log message (at INFO level)
> should
> > be printed stating the fact.
> >
> > To go through the steps I followed to set up the environment: I've built
> > Flink and put it in the same folder under the two nodes of the cluster,
> > adjusted the configs, assigned its ownership (and write permissions) to a
> > group, then I ran kinit with a user belonging to that group on both the
> > nodes and finally I ran start-cluster.sh and deployed the job. I tried
> both
> > running the job as the same user who ran the start-cluster.sh script and
> > another one (still authenticated with Kerberos on both nodes).
> >
> > The core-site.xml correctly states that the authentication method is
> > kerberos and using the hdfs CLI everything runs as expected. Thinking it
> > could be an error tied to permissions on the core-site.xml file, I also
> added
> > the user running the start-cluster.sh script to the hadoop group, which
> > owned the file, yielding the same results, unfortunately.
> >
> > Can you help me troubleshoot this issue? Thank you so much in advance!
> >
> > --
> > BR,
> > Stefano Baghino
> >
> > Software Engineer @ Radicalbit
>



-- 
BR,
Stefano Baghino

Software Engineer @ Radicalbit


Re: Error when accessing secure HDFS with standalone Flink

2016-03-16 Thread Maximilian Michels
Hi Stefano,

You have probably seen
https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html#kerberos
?

Currently, all nodes need to be authenticated with Kerberos before
Flink is started (not just the JobManager). Could it be that the
start-cluster.sh script is actually not authenticated using Kerberos
at the nodes it sshes to when it starts the TaskManagers?
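
In case it helps to narrow things down, here is a small diagnostic sketch
(plain Hadoop client API, nothing Flink-specific; it assumes core-site.xml is
on the classpath) that can be run on each node to check whether the JVM picks
up the Kerberos settings and the kinit credentials:

```
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object KerberosCheck {
  def main(args: Array[String]): Unit = {
    // Reads core-site.xml (and friends) from the classpath.
    val conf = new Configuration()
    UserGroupInformation.setConfiguration(conf)

    // Expect "kerberos", "true" and the principal obtained via kinit
    // if the node is set up as intended.
    println("authentication:   " + conf.get("hadoop.security.authentication"))
    println("security enabled: " + UserGroupInformation.isSecurityEnabled)
    println("current user:     " + UserGroupInformation.getCurrentUser)
  }
}
```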

Best,
Max


On Fri, Mar 11, 2016 at 8:17 AM, Stefano Baghino
 wrote:
> Hello everybody,
>
> my colleagues and I have been running some tests on Flink 1.0.0 in a secure
> environment (Kerberos). Yesterday we did several tests on the standalone
> Flink deployment but couldn't get it to access HDFS. Judging from the error
> it looks like Flink is not trying to authenticate itself with Kerberos. The
> root cause of the error is
> "org.apache.hadoop.security.AccessControlException: SIMPLE authentication is
> not enabled.  Available:[TOKEN, KERBEROS]". I've put the whole logs in this
> gist. I've gone through the source code and, judging from what I saw, this
> error is emitted by Hadoop if a client is not using any authentication
> method on a secure cluster. Also, in the source code of Flink, it looks like
> when running a job on a secure cluster a log message (at INFO level) should
> be printed stating the fact.
>
> To go through the steps I followed to set up the environment: I've built
> Flink and put it in the same folder under the two nodes of the cluster,
> adjusted the configs, assigned its ownership (and write permissions) to a
> group, then I ran kinit with a user belonging to that group on both the
> nodes and finally I ran start-cluster.sh and deployed the job. I tried both
> running the job as the same user who ran the start-cluster.sh script and
> another one (still authenticated with Kerberos on both nodes).
>
> The core-site.xml correctly states that the authentication method is
> kerberos and using the hdfs CLI everything runs as expected. Thinking it
> could be an error tied to permissions on the core-site.xml file, I also added
> the user running the start-cluster.sh script to the hadoop group, which
> owned the file, yielding the same results, unfortunately.
>
> Can you help me troubleshoot this issue? Thank you so much in advance!
>
> --
> BR,
> Stefano Baghino
>
> Software Engineer @ Radicalbit


Re: JDBCInputFormat preparation with Flink 1.1-SNAPSHOT and Scala 2.11

2016-03-16 Thread Robert Metzger
Sorry for joining this discussion late. Maybe this is also interesting for
you:
http://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/


On Wed, Mar 9, 2016 at 1:47 PM, Prez Cannady 
wrote:

> Thanks.  I need to dive in a bit deeper, but I did clarify some things in my
> mind which bear mentioning.
>
> 1. Sourcing JDBC data is not a streaming operation but a batch one, which
> makes sense, since you generally slurp rather than stream relational
> data, so within the constraints provided you’ll be operating on whole
> result sets.
> 2. Kafka is useful for mating batch processes (like slurping a database)
> with streaming ones (reading out the results of a database query and then
> distributing them to various processing nodes).
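>
> To make the second point a bit more concrete, here is a hedged sketch
> (Scala, assuming the Kafka 0.8 connector shipped with Flink 1.0/1.1; the
> topic name and broker address are made up) of publishing a stream such as
> the JDBC-backed one below to a Kafka topic:
>
> ```
> import java.util.Properties
>
> import org.apache.flink.streaming.api.scala._
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer08
> import org.apache.flink.streaming.util.serialization.SimpleStringSchema
>
> def publishToKafka(records: DataStream[String]): Unit = {
>   val props = new Properties()
>   props.setProperty("bootstrap.servers", "localhost:9092")
>
>   // Each record is written as a UTF-8 string to the "persons" topic.
>   records.addSink(
>     new FlinkKafkaProducer08[String]("persons", new SimpleStringSchema(), props))
> }
> ```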
>
> Prez Cannady
> p: 617 500 3378
> e: revp...@opencorrelate.org
> GH: https://github.com/opencorrelate
> LI: https://www.linkedin.com/in/revprez
>
> On Mar 9, 2016, at 6:46 AM, Prez Cannady 
> wrote:
>
> I suspected as much (the tuple size limitation).  Creating my own
> InputFormat seems to be the best solution, but before I go down that rabbit
> hole I wanted to see at least a semi-trivial working example of
> JDBCInputFormat with Scala 2.11.
>
> I’d appreciate a look at that prototype if it’s publicly available (even if
> it is Java). I might glean a hint from it.
>
> Prez Cannady
> p: 617 500 3378
> e: revp...@opencorrelate.org
> GH: https://github.com/opencorrelate
> LI: https://www.linkedin.com/in/revprez
>
> On Mar 9, 2016, at 3:25 AM, Chesnay Schepler  wrote:
>
> you can always create your own InputFormat, but there is no
> AbstractJDBCInputFormat if that's what you were looking for.
>
> When you say arbitrary tuple size, do you mean a) a size greater than 25,
> or b) tuples of different sizes?
> If a), unless you are fine with using nested tuples, you won't get around
> the tuple size limitation. Since the user has to be aware of the nesting
> (the fields can be accessed directly via tuple.f0 etc.), this can't
> really be done in a general-purpose fashion.
> If b), this will straight-up not work with tuples.
>
> You could use POJOs, though; then you could also group by column names.
>
> I'm not sure about Scala, but in the Java Stream API you can pass the
> InputFormat and the TypeInformation into createInput.
>
> I recently did a prototype where the input type is determined
> automatically by querying the database. If this is a problem for you, feel
> free to ping me.
>
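> For what it's worth, a hedged sketch of how that could look from Scala,
> assuming the tuple-based JDBCInputFormat and that the Scala
> StreamExecutionEnvironment's createInput takes the format plus an explicit
> TypeInformation; treat it as untested against 1.1-SNAPSHOT rather than a
> working example:
>
> ```
> import org.apache.flink.api.common.io.InputFormat
> import org.apache.flink.api.common.typeinfo.BasicTypeInfo
> import org.apache.flink.api.java.io.jdbc.JDBCInputFormat
> import org.apache.flink.api.java.tuple.{Tuple1 => JTuple1}
> import org.apache.flink.api.java.typeutils.TupleTypeInfo
> import org.apache.flink.streaming.api.scala._
>
> val env = StreamExecutionEnvironment.getExecutionEnvironment
>
> // Builder calls as in the snippet below; the cast papers over the builder
> // not being parameterized on the produced tuple type (it may be unnecessary
> // depending on the Flink version).
> val inputFormat = JDBCInputFormat.buildJDBCInputFormat()
>   .setDrivername("org.postgresql.Driver")
>   .setDBUrl("jdbc:postgresql:test")
>   .setQuery("select name from persons")
>   .finish()
>   .asInstanceOf[InputFormat[JTuple1[String], _]]
>
> // Explicit type information for the single STRING column of the result.
> val typeInfo = new TupleTypeInfo[JTuple1[String]](BasicTypeInfo.STRING_TYPE_INFO)
>
> val stream: DataStream[JTuple1[String]] = env.createInput(inputFormat)(typeInfo)
> ```
>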
> On 09.03.2016 03:17, Prez Cannady wrote:
>
> I’m attempting to create a stream using JDBCInputFormat.  The objective is to
> convert each record into a tuple and then serialize it for input into a Kafka
> topic.  Here’s what I have so far.
>
> ```
> val env = StreamExecutionEnvironment.getExecutionEnvironment
>
> val inputFormat = JDBCInputFormat.buildJDBCInputFormat()
>   .setDrivername("org.postgresql.Driver")
>   .setDBUrl("jdbc:postgresql:test")
>   .setQuery("select name from persons")
>   .finish()
>
> val stream : DataStream[Tuple1[String]] = env.createInput(...)
> ```
>
> I think this is essentially what I want to do.  It would be nice if I
> could return tuples of arbitrary length, but reading the code suggests I
> have to commit to a defined arity.  So I have some questions.
>
> 1. Is there a better way to read from a database (i.e., defining my own
> `InputFormat` using Slick)?
> 2. To get the above example working, what should I supply to `createInput`?
>
>
> Prez Cannady
> p: 617 500 3378
> e:  revp...@opencorrelate.org
> GH:  https://github.com/opencorrelate
> LI:  https://www.linkedin.com/in/revprez
>
>


Re: Using a POJO class wrapping an ArrayList

2016-03-16 Thread Fabian Hueske
Hi Mengqi,

I did not completely understand your use case.

If you would like to use a composite key (a key with multiple fields), there
are two alternatives:
- use a tuple as the key type. This only works if all records have the same
number of key fields. Tuple serialization and comparisons are very
efficient.
- implement your own type (like you did) but override the compare() method.
Here you can also deal with a different number of key fields. A custom type
will be serialized with Kryo. You can make it more efficient if you
register the type in the ExecutionConfig.
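
For the second alternative, a short sketch of the registration step (assuming
a custom key class named Ids, as in your example):

```
import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
// Registering the custom type gives Kryo a compact, pre-registered tag for it.
env.getConfig.registerKryoType(classOf[Ids])
```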

If you want to use multiple keys for a record, you need to emit the record
multiple times in a FlatMapFunction and assign a different key each time.
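
As a rough sketch of that last option (Scala DataSet API; the Record type and
the data are made up for illustration):

```
import org.apache.flink.api.scala._

case class Record(ids: Seq[String], payload: String)

val env = ExecutionEnvironment.getExecutionEnvironment

val records = env.fromElements(
  Record(Seq("a", "b"), "first"),
  Record(Seq("b", "c"), "second"))

// Emit one (key, record) pair per id, so a record appears under each of its keys.
val keyedByEachId = records.flatMap(r => r.ids.map(id => (id, r)))

// Group on the key field (position 0 of the tuple), e.g. to count records per id.
val recordsPerId = keyedByEachId
  .groupBy(0)
  .reduceGroup { group =>
    val pairs = group.toList
    (pairs.head._1, pairs.size)
  }

recordsPerId.print()
```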

Hope this helps,
Fabian


2016-03-14 10:43 GMT+01:00 Mengqi Yang :

> Hi all,
>
> Now I am building a POJO class for key selectors. Here is the class:
>
> import java.io.Serializable;
> import java.util.ArrayList;
>
> // Element type assumed to be String here; the original used a raw ArrayList.
> public class Ids implements Comparable<Ids>, Serializable {
>
>     private static final long serialVersionUID = 1L;
>
>     private ArrayList<String> ids = new ArrayList<String>();
>
>     Ids() {}
>
>     Ids(ArrayList<String> inputIds) {
>         this.ids = inputIds;
>     }
>
>     // Required by Comparable; the actual comparison over the id list is
>     // what the question below is about.
>     @Override
>     public int compareTo(Ids other) {
>         return 0;
>     }
> }
>
> And the question is: how could I use each element in the array list as a
> field for further key selection? I saw that Flink's type info counts the
> ArrayList as a single field. Or maybe I didn't understand it correctly.
>
> Thanks,
> Mengqi
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Using-a-POJO-class-wrapping-an-ArrayList-tp5483.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.
>


Re: Memory ran out PageRank

2016-03-16 Thread Fabian Hueske
Hi Ovidiu,

putting the CompactingHashTable aside, all data structures and algorithms
that use managed memory can spill to disk if the data exceeds the memory
capacity.

It was a conscious choice not to let the CompactingHashTable spill. Once
the solution set hash table is spilled, (parts of) the hash table need to
be read and written in each iteration. This would have a very significant
impact on performance. So far, the guideline has been to add more machines
if you run out of memory in a delta iteration, to keep the computation
in memory.
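
To make the terminology concrete, here is a minimal sketch (Scala DataSet API;
the data and step function are made up) of a delta iteration; the solution set
built here is exactly the state that ends up in the in-memory
CompactingHashTable discussed above:

```
import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment

// (id, value) pairs; the solution set is keyed on the id (field 0).
val initialSolution = env.fromElements((1L, 0L), (2L, 0L), (3L, 0L))
val initialWorkset = initialSolution

val result = initialSolution.iterateDelta(initialWorkset, 10, Array(0)) {
  (solution, workset) =>
    // Toy step function: bump every value that appears in the workset.
    val delta = workset.map { case (id, value) => (id, value + 1) }
    // First element: updates merged into the solution set (the hash table);
    // second element: the workset for the next iteration.
    (delta, delta)
}

result.print()
```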

Best, Fabian

2016-03-16 8:14 GMT+01:00 Ovidiu-Cristian MARCU <
ovidiu-cristian.ma...@inria.fr>:

> Hi,
>
> Regarding the solution set running out of memory, I would like an issue to
> be filed against it.
>
> Looking into the code for CompactingHashTable, I see:
>
> The hash table is internally divided into two parts: The hash index, and
> the partition buffers that store the actual records. When records are
> inserted or updated, the hash table appends the records to its
> corresponding partition, and inserts or updates the entry in the hash
> index. In the case that the hash table runs out of memory, it compacts a
> partition by walking through the hash index and copying all reachable
> elements into a fresh partition. After that, it releases the memory of the
> partition to compact.
>
> The expected behaviour when the hash table runs out of memory is not
> clear.
>
> Spark, by contrast, works on RDDs that can be cached in memory or spilled
> to disk; something similar could be done for all the components
> currently built in memory and not spilled to disk, to avoid
> OutOfMemory errors.
> What do you think?
>
> Best,
> Ovidiu
>
> On 14 Mar 2016, at 18:48, Ufuk Celebi  wrote:
>
> Probably the limitation is that the number of keys is different in the
> real and the synthetic data set respectively. Can you confirm this?
>
> The solution set for delta iterations is currently implemented as an
> in-memory hash table that works on managed memory segments, but is not
> spillable.
>
> – Ufuk
>
> On Mon, Mar 14, 2016 at 6:30 PM, Ovidiu-Cristian MARCU
>  wrote:
>
>
> This problem is surprising, as I was able to run PR and CC on a larger
> graph (2bil edges), but with this synthetic graph (1bil edges, groups of 10)
> I ran out of memory; regarding configuration (memory, parallelism, and other
> internals) I was using the same settings.
> There is some limitation somewhere; I will try to understand more about what
> is happening.
>
> Best,
> Ovidiu
>
> On 14 Mar 2016, at 18:06, Martin Junghanns 
> wrote:
>
> Hi,
>
> I understand the confusion. So far, I did not run into the problem, but I
> think this needs to be addressed, as all our graph processing abstractions
> are implemented on top of the delta iteration.
>
> According to the previous mailing list discussion, the problem is with the
> solution set and its missing ability to spill.
>
> If this is still the case, we should open an issue for that. Any
> further opinions on that?
>
> Cheers,
> Martin
>
>
> On 14.03.2016 17:55, Ovidiu-Cristian MARCU wrote:
>
> Thank you for this alternative.
> I don’t understand how the workaround will fix this on systems with
> limited memory and maybe a larger graph.
>
> Running Connected Components on the same graph gives the same problem.
>
> IterationHead(Unnamed Delta Iteration)(82/88) switched to FAILED
> java.lang.RuntimeException: Memory ran out. Compaction failed.
> numPartitions: 32 minPartition: 31 maxPartition: 32 number of overflow
> segments: 417 bucketSize: 827 Overall memory: 149159936 Partition memory:
> 65601536 Message: Index: 32, Size: 31
>at
> org.apache.flink.runtime.operators.hash.CompactingHashTable.insertRecordIntoPartition(CompactingHashTable.java:469)
>at
> org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:414)
>at
> org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:325)
>at
> org.apache.flink.runtime.iterative.task.IterationHeadTask.readInitialSolutionSet(IterationHeadTask.java:212)
>at
> org.apache.flink.runtime.iterative.task.IterationHeadTask.run(IterationHeadTask.java:273)
>at
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:354)
>at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>at java.lang.Thread.run(Thread.java:745)
>
> Best,
> Ovidiu
>
> On 14 Mar 2016, at 17:36, Martin Junghanns 
> wrote:
>
> Hi
>
> I think this is the same issue we had before on the list [1]. Stephan
> recommended the following workaround:
>
> A possible workaround is to use the option "setSolutionSetUnmanaged(true)"
> on the iteration. That will eliminate the fragmentation issue, at least.
>
>
> Unfortunately, you cannot set this when using graph.run(new PageRank(...))
>
> I created a Gist which shows you how to set this using PageRank:
>
> https://gist.github.com/s1ck/801a8ef97ce374b358df

Re: Memory ran out PageRank

2016-03-16 Thread Ovidiu-Cristian MARCU
Hi,

Regarding the solution set running out of memory, I would like an issue to be
filed against it.

Looking into the code for CompactingHashTable, I see:

The hash table is internally divided into two parts: The hash index, and the 
partition buffers that store the actual records. When records are inserted or 
updated, the hash table appends the records to its corresponding partition, and 
inserts or updates the entry in the hash index. In the case that the hash table 
runs out of memory, it compacts a partition by walking through the hash index 
and copying all reachable elements into a fresh partition. After that, it 
releases the memory of the partition to compact.

The expected behaviour when the hash table runs out of memory is not clear.

Spark, by contrast, works on RDDs that can be cached in memory or spilled to
disk; something similar could be done for all the components currently built
in memory and not spilled to disk, to avoid OutOfMemory errors.
What do you think?

Best,
Ovidiu

> On 14 Mar 2016, at 18:48, Ufuk Celebi  wrote:
> 
> Probably the limitation is that the number of keys is different in the
> real and the synthetic data set respectively. Can you confirm this?
> 
> The solution set for delta iterations is currently implemented as an
> in-memory hash table that works on managed memory segments, but is not
> spillable.
> 
> – Ufuk
> 
> On Mon, Mar 14, 2016 at 6:30 PM, Ovidiu-Cristian MARCU
>  wrote:
>> 
>> This problem is surprising, as I was able to run PR and CC on a larger graph
>> (2bil edges), but with this synthetic graph (1bil edges, groups of 10) I ran
>> out of memory; regarding configuration (memory, parallelism, and other
>> internals) I was using the same settings.
>> There is some limitation somewhere; I will try to understand more about what
>> is happening.
>> 
>> Best,
>> Ovidiu
>> 
>>> On 14 Mar 2016, at 18:06, Martin Junghanns  wrote:
>>> 
>>> Hi,
>>> 
>>> I understand the confusion. So far, I did not run into the problem, but I 
>>> think this needs to be addressed, as all our graph processing abstractions 
>>> are implemented on top of the delta iteration.
>>> 
>>> According to the previous mailing list discussion, the problem is with the 
>>> solution set and its missing ability to spill.
>>> 
>>> If this is still the case, we should open an issue for that. Any 
>>> further opinions on that?
>>> 
>>> Cheers,
>>> Martin
>>> 
>>> 
>>> On 14.03.2016 17:55, Ovidiu-Cristian MARCU wrote:
 Thank you for this alternative.
 I don’t understand how the workaround will fix this on systems with 
 limited memory and maybe a larger graph.
 
 Running Connected Components on the same graph gives the same problem.
 
 IterationHead(Unnamed Delta Iteration)(82/88) switched to FAILED
 java.lang.RuntimeException: Memory ran out. Compaction failed. 
 numPartitions: 32 minPartition: 31 maxPartition: 32 number of overflow 
 segments: 417 bucketSize: 827 Overall memory: 149159936 Partition memory: 
 65601536 Message: Index: 32, Size: 31
at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.insertRecordIntoPartition(CompactingHashTable.java:469)
at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:414)
at 
 org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:325)
at 
 org.apache.flink.runtime.iterative.task.IterationHeadTask.readInitialSolutionSet(IterationHeadTask.java:212)
at 
 org.apache.flink.runtime.iterative.task.IterationHeadTask.run(IterationHeadTask.java:273)
at 
 org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:354)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
at java.lang.Thread.run(Thread.java:745)
 
 Best,
 Ovidiu
 
> On 14 Mar 2016, at 17:36, Martin Junghanns  
> wrote:
> 
> Hi
> 
> I think this is the same issue we had before on the list [1]. Stephan 
> recommended the following workaround:
> 
>> A possible workaround is to use the option 
>> "setSolutionSetUnmanaged(true)"
>> on the iteration. That will eliminate the fragmentation issue, at least.
> 
> Unfortunately, you cannot set this when using graph.run(new PageRank(...))
> 
> I created a Gist which shows you how to set this using PageRank
> 
> https://gist.github.com/s1ck/801a8ef97ce374b358df
> 
> Please let us know if it worked out for you.
> 
> Cheers,
> Martin
> 
> [1] 
> http://mail-archives.apache.org/mod_mbox/flink-user/201508.mbox/%3CCAELUF_ByPAB%2BPXWLemPzRH%3D-awATeSz4sGz4v9TmnvFku3%3Dx3A%40mail.gmail.com%3E
> 
> On 14.03.2016 16:55, Ovidiu-Cristian MARCU wrote:
>> Hi,