[jira] [Created] (FLINK-2230) Add Support for Null-Values in TupleSerializer

2015-06-15 Thread Shiti Saxena (JIRA)
Shiti Saxena created FLINK-2230: --- Summary: Add Support for Null-Values in TupleSerializer Key: FLINK-2230 URL: https://issues.apache.org/jira/browse/FLINK-2230 Project: Flink Issue Type: Bug

Re: Listing Apache-2.0 dependencies in LICENSE file

2015-06-15 Thread Henry Saputra
Thanks Till, that clears up the confusion I had =) On Mon, Jun 15, 2015 at 1:37 AM, Till Rohrmann wrote: > Hi Henry, > > there are actually two licensing questions and one update for the current > release going on but all of them are orthogonal and therefore I would like > to keep them separate.

Re: Listing Apache-2.0 dependencies in LICENSE file

2015-06-15 Thread Stephan Ewen
It is true, we need not list the dependencies under ASL2. I originally added them as a convenience list of bundled dependencies of the source release. I think it is nice to keep them, if not resulting in excessive overhead for maintenance. On Mon, Jun 15, 2015 at 7:22 PM, Ted Dunning wrote: > H

Re: Listing Apache-2.0 dependencies in LICENSE file

2015-06-15 Thread Ted Dunning
Here are some cogent comments from Marvin Humphrey. On Mon, Jun 15, 2015 at 6:04 PM, Marvin Humphrey wrote: > Hi Ted, > > The discussion seems to be about the convenience binary, not the official > source release, so ASF policy differs. The party who supplies a > convenience binary bears res

[jira] [Created] (FLINK-2229) Data sets involving non-primitive arrays cannot be unioned

2015-06-15 Thread Sebastian Kruse (JIRA)
Sebastian Kruse created FLINK-2229: -- Summary: Data sets involving non-primitive arrays cannot be unioned Key: FLINK-2229 URL: https://issues.apache.org/jira/browse/FLINK-2229 Project: Flink

Re: The null in Flink

2015-06-15 Thread Ted Dunning
On Mon, Jun 15, 2015 at 8:45 AM, Maximilian Michels wrote: > Just to give an idea what null values could cause in Flink: DataSet.count() > returns the number of elements of all values in a Dataset (null or not) > while #834 would ignore null values and aggregate the DataSet without them. > Compa
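The difference described in this message can be sketched in plain Python. This is an illustration only and does not use any Flink API; `count_all` and `sum_skipping_nulls` are hypothetical helper names, and `None` stands in for a null value.

```python
def count_all(values):
    """Count every element, null or not (the count() behaviour described above)."""
    return sum(1 for _ in values)


def sum_skipping_nulls(values):
    """Aggregate while ignoring null elements (the behaviour attributed to #834)."""
    return sum(v for v in values if v is not None)


data = [3, None, 4, None, 5]

total = sum_skipping_nulls(data)
n_all = count_all(data)
n_non_null = count_all(v for v in data if v is not None)

# The two conventions disagree as soon as they are combined, e.g. in a mean.
mean_over_all = total / n_all
mean_over_non_null = total / n_non_null
print(n_all, total, mean_over_all, mean_over_non_null)
```

Dividing a null-skipping sum by a null-counting count gives a different mean than dividing by the number of non-null elements, which is the kind of inconsistency the thread is concerned about.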

Re: The null in Flink

2015-06-15 Thread Ted Dunning
The example of SQL is obviously dominating thoughts of NULL, but I think that the example of R is probably better in terms of how things can work fairly well. NULL is a key concept and very helpful in a number of settings. With R's fairly simple functional nature it is easy to filter data and mos

[jira] [Created] (FLINK-2228) Web fronted uses two different timezones when reporting the time for job

2015-06-15 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2228: -- Summary: Web fronted uses two different timezones when reporting the time for job Key: FLINK-2228 URL: https://issues.apache.org/jira/browse/FLINK-2228 Pr

The null in Flink

2015-06-15 Thread Maximilian Michels
Hi everyone, I'm seeing a lot of null value related pull requests nowadays, like these: https://github.com/apache/flink/pull/780 https://github.com/apache/flink/pull/831 https://github.com/apache/flink/pull/834 It used to be the case that null values were simply not supported by Flink. Recently,

[jira] [Created] (FLINK-2227) .yarn-properties file is not cleaned up

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2227: -- Summary: .yarn-properties file is not cleaned up Key: FLINK-2227 URL: https://issues.apache.org/jira/browse/FLINK-2227 Project: Flink Issue Type: Improvement

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Aljoscha Krettek
I created this to help with release testing: https://github.com/aljoscha/FliRTT You just start your cluster and then point the tool to the Flink directory. It will then run all the examples with both builtin data and external data. On Mon, 15 Jun 2015 at 17:15 Maximilian Michels wrote: > Hmm. M

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Maximilian Michels
Hmm. Might be interesting to check out whether this is a regression from rc1 to rc2. In any case, it is a serious release blocker and we need to fix it. On Mon, Jun 15, 2015 at 5:04 PM, Till Rohrmann wrote: > I might have found another release blocker. While running some cluster > tests I also t

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Till Rohrmann
I might have found another release blocker. While running some cluster tests I also tried to run the `ConnectedComponents` example. However, sometimes the example couldn't be executed because the scheduler could not schedule co-located tasks, `NoResourceAvailableException`, even though it should ha

[jira] [Created] (FLINK-2226) Fail YARN application on failed single-job YARN cluster

2015-06-15 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2226: - Summary: Fail YARN application on failed single-job YARN cluster Key: FLINK-2226 URL: https://issues.apache.org/jira/browse/FLINK-2226 Project: Flink

[jira] [Created] (FLINK-2225) Erroneous scheduling of co-located tasks

2015-06-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2225: Summary: Erroneous scheduling of co-located tasks Key: FLINK-2225 URL: https://issues.apache.org/jira/browse/FLINK-2225 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-2224) JobManager log does not contain root cause and stack trace of exceptions

2015-06-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2224: Summary: JobManager log does not contain root cause and stack trace of exceptions Key: FLINK-2224 URL: https://issues.apache.org/jira/browse/FLINK-2224 Project: Flink

Re: Iteration stats logging

2015-06-15 Thread Nam-Luc Tran
Hi Ufuk, The kind of things we'd like to log are: time spent in the iteration, residual of the algorithm (convergence), current iteration. Best regards, Tran Nam-Luc On Monday, 15/06/2015 at 16:15, Ufuk Celebi wrote: Hey Tran Nam-Luc, there is currently no way to do this. The iteration sync

Re: Iteration stats logging

2015-06-15 Thread Ufuk Celebi
Hey Tran Nam-Luc, there is currently no way to do this. The iteration sync task keeps track of iteration convergence/max number of iterations and signals termination to the iteration head. After this, the head flushes the produced result to the next task (after the iteration) and the intermed

[jira] [Created] (FLINK-2223) Web frontend shows wrong accumulator results for latest job

2015-06-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2223: Summary: Web frontend shows wrong accumulator results for latest job Key: FLINK-2223 URL: https://issues.apache.org/jira/browse/FLINK-2223 Project: Flink Is

Iteration stats logging

2015-06-15 Thread Nam-Luc Tran
Hello Everyone, I would like to log certain stats during iterations in a bulk iterative job. The way I do this is store the things I want at each iteration and plan to flush everything to HDFS once all the iterations are done. To do that I would need to know when the last iteration is invoked in o
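The collect-then-flush pattern described here can be sketched outside of Flink as follows. This is a minimal sketch in plain Python, not a Flink bulk iteration: `run_with_stats` and `flush` are hypothetical names, the in-memory buffer stands in for HDFS, and the step function is an arbitrary fixed-point iteration chosen only so the example runs. The recorded fields follow the stats named in this thread (current iteration, time spent, residual).

```python
import io
import json
import time


def run_with_stats(step, state, max_iterations, tolerance):
    """Run a fixed-point iteration and keep one stats record per iteration."""
    stats = []
    for iteration in range(1, max_iterations + 1):
        started = time.perf_counter()
        new_state = step(state)
        residual = abs(new_state - state)
        stats.append({
            "iteration": iteration,
            "seconds": time.perf_counter() - started,
            "residual": residual,
        })
        state = new_state
        if residual < tolerance:
            break  # converged, so this was the last iteration
    return state, stats


def flush(stats, sink):
    """Write all collected records at once, one JSON object per line."""
    for record in stats:
        sink.write(json.dumps(record) + "\n")


# Babylonian iteration for sqrt(2), used here only as a stand-in workload.
result, stats = run_with_stats(lambda x: 0.5 * (x + 2.0 / x), 1.0, 50, 1e-12)

buffer = io.StringIO()  # stand-in for a file on HDFS
flush(stats, buffer)
lines = buffer.getvalue().splitlines()
```

In this sketch the driver loop itself knows when the last iteration has run, which is exactly the piece of information the question asks for inside a bulk iterative job.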

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Ufuk Celebi
Please continue the discussion in the issue Aljoscha opened: https://issues.apache.org/jira/browse/FLINK-2221 I think it is better to only point to issues in this mail thread. Otherwise the discussions are very hard to follow.

[jira] [Created] (FLINK-2222) Memory/CPU graphs in web frontend are not well formatted using Chrome/Safari

2015-06-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2222: Summary: Memory/CPU graphs in web frontend are not well formatted using Chrome/Safari Key: FLINK-2222 URL: https://issues.apache.org/jira/browse/FLINK-2222 Project: F

[jira] [Created] (FLINK-2221) Checkpoints to "file://" are not cleaned up

2015-06-15 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-2221: --- Summary: Checkpoints to "file://" are not cleaned up Key: FLINK-2221 URL: https://issues.apache.org/jira/browse/FLINK-2221 Project: Flink Issue Type: B

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Gyula Fóra
The checkpoint cleanup works for HDFS, right? I assume the job manager should see that as well. This is not a trivial problem in general, so the assumption we are making now is that the JM can actually execute the cleanup logic. Aljoscha Krettek wrote (on Mon, Jun 15, 2015, 15:40): > @Uf

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Aljoscha Krettek
Oh yes, on that I agree. I'm just saying that the checkpoint setting should maybe be a central setting. On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax wrote: > Hi, > > IMHO, it is very common that Workers do have their own config files (eg, > Storm works the same way). And I think it makes a lot of

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Aljoscha Krettek
@Ufuk The cleanup bug for file:// checkpoints is not easy to fix IMHO. On Mon, 15 Jun 2015 at 15:39 Aljoscha Krettek wrote: > Oh yes, on that I agree. I'm just saying that the checkpoint setting > should maybe be a central setting. > > On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax < > mj...@infor

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Matthias J. Sax
Hi, IMHO, it is very common that Workers do have their own config files (eg, Storm works the same way). And I think it makes a lot of sense. You might run Flink in a heterogeneous cluster and you want to assign different memory and slots for different hardware. This would not be possible using a

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Ufuk Celebi
On Mon, Jun 15, 2015 at 3:30 PM, Aljoscha Krettek wrote: > Regarding 1), that's why I said "bugs and features". :D But I think of it as > a bug, since people will normally set it in the flink-conf.yaml on the > master and assume that it works. That's what I assumed and it took me a > while to figu

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Aljoscha Krettek
Regarding 1), that's why I said "bugs and features". :D But I think of it as a bug, since people will normally set it in the flink-conf.yaml on the master and assume that it works. That's what I assumed and it took me a while to figure out that the task managers don't respect this setting. Regardin

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Márton Balassi
@Aljoscha: 1) I think this just means that you can set the state backend on a taskmanager basis. 3) This is a serious issue then. Does it work when you set it in the flink-conf.yaml? On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek wrote: > So, during my testing of the state checkpointing on a cl

[jira] [Created] (FLINK-2220) Join on Pojo without hashCode() silently fails

2015-06-15 Thread Marcus Leich (JIRA)
Marcus Leich created FLINK-2220: --- Summary: Join on Pojo without hashCode() silently fails Key: FLINK-2220 URL: https://issues.apache.org/jira/browse/FLINK-2220 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-2219) JobManagerInfoServlet IllegalArgumentException when pressing state button

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2219: -- Summary: JobManagerInfoServlet IllegalArgumentException when pressing state button Key: FLINK-2219 URL: https://issues.apache.org/jira/browse/FLINK-2219 Project: Flink

Re: Testing Apache Flink 0.9.0-rc2

2015-06-15 Thread Aljoscha Krettek
So, during my testing of the state checkpointing on a cluster I discovered several things (bugs and features): - If you have a setup where the configuration is not synced to the workers they do not pick up the state back-end configuration. The workers do not respect the setting in the flink-cont.

[jira] [Created] (FLINK-2218) Web client cannot specify parallelism

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2218: -- Summary: Web client cannot specify parallelism Key: FLINK-2218 URL: https://issues.apache.org/jira/browse/FLINK-2218 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-2217) Web client does not remove uploaded JARs

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2217: -- Summary: Web client does not remove uploaded JARs Key: FLINK-2217 URL: https://issues.apache.org/jira/browse/FLINK-2217 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-2216) Examples directory contains `flink-java-examples-0.9.0-javadoc.jar`

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2216: -- Summary: Examples directory contains `flink-java-examples-0.9.0-javadoc.jar` Key: FLINK-2216 URL: https://issues.apache.org/jira/browse/FLINK-2216 Project: Flink

[jira] [Created] (FLINK-2215) Document how to use the HCatInputFormat

2015-06-15 Thread Timo Walther (JIRA)
Timo Walther created FLINK-2215: --- Summary: Document how to use the HCatInputFormat Key: FLINK-2215 URL: https://issues.apache.org/jira/browse/FLINK-2215 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-2214) ALS predict and empiricalRisk function can cause hash join function to exceed maximum number of recursions

2015-06-15 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2214: Summary: ALS predict and empiricalRisk function can cause hash join function to exceed maximum number of recursions Key: FLINK-2214 URL: https://issues.apache.org/jira/browse/FLIN

[jira] [Created] (FLINK-2213) Configure number of vcores

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2213: -- Summary: Configure number of vcores Key: FLINK-2213 URL: https://issues.apache.org/jira/browse/FLINK-2213 Project: Flink Issue Type: Improvement Compon

[jira] [Created] (FLINK-2212) Exit code on failed job

2015-06-15 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2212: -- Summary: Exit code on failed job Key: FLINK-2212 URL: https://issues.apache.org/jira/browse/FLINK-2212 Project: Flink Issue Type: Improvement Component

Re: GSA based on edge direction

2015-06-15 Thread Vasiliki Kalavri
Hi Pieter, we have not added the option to choose message direction in the GSA iterations yet, but this should be added soon (related JIRA: https://issues.apache.org/jira/browse/FLINK-2141). I guess the way you suggest, running GSA on the graph after reversing the edges, is how to do this now, yes
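The workaround mentioned here, reversing the edges before running the iteration, can be illustrated in plain Python. This is not the Gelly GSA API: `reverse_edges` and `gather_sum` are hypothetical helpers, and the sketch simply assumes, for illustration, that each edge delivers the source vertex's value to the target vertex.

```python
def reverse_edges(edges):
    """Swap source and target of every (source, target) edge."""
    return [(dst, src) for src, dst in edges]


def gather_sum(edges, vertex_values):
    """For each vertex, sum the values arriving along its incoming edges."""
    totals = {vertex: 0 for vertex in vertex_values}
    for src, dst in edges:
        totals[dst] += vertex_values[src]
    return totals


edges = [(1, 2), (1, 3), (2, 3)]
values = {1: 10, 2: 20, 3: 30}

along_edges = gather_sum(edges, values)
against_edges = gather_sum(reverse_edges(edges), values)
print(along_edges, against_edges)
```

Running the same gather step on the reversed edge list makes values flow in the opposite direction without changing the gather logic itself.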

Re: Listing Apache-2.0 dependencies in LICENSE file

2015-06-15 Thread Ufuk Celebi
To summarize: 1. Your PR changes are necessary. Thanks for doing it. 2. The consensus (PR comments + ML) is to skip other Apache licensed dependencies. 3. Shaded Jars need LICENSE and NOTICE in META-INF. Let's wrap this up today and get it out of the way of the release. :-) – Ufuk On 15 Jun

Re: Apache Flink 0.9 ALS API

2015-06-15 Thread Till Rohrmann
+1 for longs as IDs. Not so much in favour of Strings for the user ID because the row index could also denote the actual item ID if you swap the indices. Furthermore, you can always add a transformer which assigns unique IDs to names. Cheers, Till On Sat, Jun 13, 2015 at 3:34 PM Chiwan Park wro
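The suggestion of a transformer that assigns unique IDs to names could look roughly like the following. This is a plain-Python sketch rather than a FlinkML transformer; `assign_ids` and the sample ratings are made up for illustration.

```python
def assign_ids(names):
    """Map each distinct name to a unique integer ID, in order of first appearance."""
    ids = {}
    for name in names:
        if name not in ids:
            ids[name] = len(ids)
    return ids


# Hypothetical (user, item, rating) triples keyed by strings.
ratings = [
    ("alice", "item-7", 4.0),
    ("bob", "item-7", 3.0),
    ("alice", "item-9", 5.0),
]

user_ids = assign_ids(user for user, _, _ in ratings)
item_ids = assign_ids(item for _, item, _ in ratings)

# The same ratings keyed by numeric IDs, as an ALS-style input would expect.
numeric_ratings = [(user_ids[u], item_ids[i], r) for u, i, r in ratings]
print(numeric_ratings)
```

Keeping the name-to-ID dictionaries around allows predictions to be mapped back to the original names afterwards.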

Re: Apache Flink 0.9 ALS API

2015-06-15 Thread Till Rohrmann
+1 for using long for both IDs. But I don't understand what the advantage of using a String as user ID is. On Sun, Jun 14, 2015 at 6:43 PM Robert Metzger wrote: > Hi Ronny, > > I accepted your previous mail to the mailing list, you got two replies: > > http://apache-flink-mailing-list-archive.10

Re: Listing Apache-2.0 dependencies in LICENSE file

2015-06-15 Thread Till Rohrmann
Hi Henry, there are actually two licensing questions and one update for the current release going on but all of them are orthogonal and therefore I would like to keep them separate. The PR [1] which you referred to contains the necessary updates for the source and binary distribution of the upcoming r