[jira] [Created] (FLINK-5557) Fix link in library methods
Greg Hogan created FLINK-5557: - Summary: Fix link in library methods Key: FLINK-5557 URL: https://issues.apache.org/jira/browse/FLINK-5557 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.2.0 The link to "Towards real-time community detection in large networks" is padded with unnecessary and seemingly malformed text. https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/gelly/library_methods.html#community-detection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5558) Replace TriangleCount with a Count analytic
Greg Hogan created FLINK-5558: - Summary: Replace TriangleCount with a Count analytic Key: FLINK-5558 URL: https://issues.apache.org/jira/browse/FLINK-5558 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor {{TriangleCount}} can be replaced by a generic {{Count}} analytic for {{DataSet}}. The analytics currently using {{TriangleCount}} can simply use {{TriangleListing}} and {{Count}}. Gelly includes both directed and undirected versions of {{TriangleListing}} and therefore two versions of {{TriangleCount}} which will be replaced by a single {{Count}} analytic which can be reused elsewhere. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5563) Add density to vertex metrics
Greg Hogan created FLINK-5563: - Summary: Add density to vertex metrics Key: FLINK-5563 URL: https://issues.apache.org/jira/browse/FLINK-5563 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5562) Driver fixes
Greg Hogan created FLINK-5562: - Summary: Driver fixes Key: FLINK-5562 URL: https://issues.apache.org/jira/browse/FLINK-5562 Project: Flink Issue Type: Bug Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.2.0 Improve parametrization and output formatting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4571) Configurable little parallelism in Gelly drivers
Greg Hogan created FLINK-4571: - Summary: Configurable little parallelism in Gelly drivers Key: FLINK-4571 URL: https://issues.apache.org/jira/browse/FLINK-4571 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Several Gelly library implementations support a configurable "little parallelism" which is important when scaling to large data sets. These algorithms include operators at the beginning and end which process data on the order of the original DataSet, as well as middle operators that exchange 100s or 1000s of times more data. The "little parallelism" should be configurable in the appropriate Gelly drivers in the flink-gelly-examples module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4572) Convert to negative in LongValueToIntValue
Greg Hogan created FLINK-4572: - Summary: Convert to negative in LongValueToIntValue Key: FLINK-4572 URL: https://issues.apache.org/jira/browse/FLINK-4572 Project: Flink Issue Type: Bug Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor The Gelly drivers expect that scale 32 edges, represented by the lower 32 bits of {{long}} values, can be converted to {{int}} values. Values between 2^31 and 2^32 - 1 should be converted to negative integers, which is not supported by {{MathUtils.checkedDownCast}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
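The desired conversion is simply to keep the lower 32 bits of the {{long}}, letting values in [2^31, 2^32 - 1] wrap to negative integers. A minimal sketch of that behavior (this is illustrative, not Gelly's actual translator code):

```java
public class LowerBitsCast {
    // Convert the lower 32 bits of a long to an int, allowing values in
    // [2^31, 2^32 - 1] to wrap to negative integers. Sketch of the desired
    // behavior described in the issue, not the actual Gelly implementation.
    static int castLowerBits(long value) {
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("Value does not fit in 32 bits: " + value);
        }
        return (int) value; // a plain Java cast keeps the low 32 bits, may be negative
    }

    public static void main(String[] args) {
        System.out.println(castLowerBits(42L));          // 42
        System.out.println(castLowerBits(4294967295L));  // -1 (2^32 - 1)
        System.out.println(castLowerBits(2147483648L));  // -2147483648 (2^31)
    }
}
```

By contrast, {{MathUtils.checkedDownCast}} rejects anything above {{Integer.MAX_VALUE}}, which is why it cannot be used here.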
[jira] [Created] (FLINK-4583) NullPointerException in CliFrontend
Greg Hogan created FLINK-4583: - Summary: NullPointerException in CliFrontend Key: FLINK-4583 URL: https://issues.apache.org/jira/browse/FLINK-4583 Project: Flink Issue Type: Bug Components: Client Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor If no Flink program is executed the following exception message is printed. This can happen when a driver prints usage due to insufficient or improper configuration. {noformat} The program finished with the following exception: java.lang.NullPointerException at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:781) at org.apache.flink.client.CliFrontend.run(CliFrontend.java:250) at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1002) at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1045) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4740) Upgrade testing libraries
Greg Hogan created FLINK-4740: - Summary: Upgrade testing libraries Key: FLINK-4740 URL: https://issues.apache.org/jira/browse/FLINK-4740 Project: Flink Issue Type: Improvement Components: Tests Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor JUnit 4.12 was released 4 Dec 2014. Flink is currently using JUnit 4.11 from 14 Nov 2012. PowerMock reports "org.powermock.reflect.exceptions.FieldNotFoundException: Field 'fTestClass' was not found in class org.junit.internal.runners.MethodValidator." https://github.com/jayway/powermock/issues/551 This is fixed in PowerMock 1.6.1+ (currently using 1.5.5, latest is 1.6.5): https://raw.githubusercontent.com/jayway/powermock/master/changelog.txt Then Mockito causes "java.lang.NoSuchMethodError: org.mockito.mock.MockCreationSettings.getSerializableMode()Lorg/mockito/mock/SerializableMode;". This is fixed by upgrading Mockito from 1.9.5 to the latest 1.10.19. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4734) Remove use of Tuple setField for fixed position
Greg Hogan created FLINK-4734: - Summary: Remove use of Tuple setField for fixed position Key: FLINK-4734 URL: https://issues.apache.org/jira/browse/FLINK-4734 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 Use {{tuple.f0 = value;}} rather than {{tuple.setField(value, 0);}}. Can the latter be optimized by the JVM? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4736) Don't duplicate fields in Ordering
Greg Hogan created FLINK-4736: - Summary: Don't duplicate fields in Ordering Key: FLINK-4736 URL: https://issues.apache.org/jira/browse/FLINK-4736 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 Duplicate fields should not be appended to an ordering. In an ordering each subsequent field is only used as a comparison when all prior fields test equal; therefore, a repeated field cannot contribute to the ordering. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
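The rule above can be sketched as a small append-with-deduplication; the class and method names here are hypothetical stand-ins, not Flink's actual {{Ordering}} API:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the de-duplication described above; appendOrdering and
// the integer field indices are hypothetical, not Flink's Ordering class.
public class OrderingSketch {
    private final List<Integer> fields = new ArrayList<>();

    public OrderingSketch appendOrdering(int fieldIndex) {
        // A repeated field is only consulted when all prior fields compare
        // equal, in which case it must also compare equal, so it can never
        // contribute to the ordering. Silently ignore duplicates.
        if (!fields.contains(fieldIndex)) {
            fields.add(fieldIndex);
        }
        return this;
    }

    public List<Integer> getFields() {
        return fields;
    }

    public static void main(String[] args) {
        OrderingSketch o = new OrderingSketch();
        o.appendOrdering(0).appendOrdering(2).appendOrdering(0);
        System.out.println(o.getFields()); // [0, 2]
    }
}
```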
[jira] [Created] (FLINK-4643) Average Clustering Coefficient
Greg Hogan created FLINK-4643: - Summary: Average Clustering Coefficient Key: FLINK-4643 URL: https://issues.apache.org/jira/browse/FLINK-4643 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Gelly has Global Clustering Coefficient and Local Clustering Coefficient. This adds Average Clustering Coefficient. The distinction is discussed in [http://jponnela.com/web_documents/twomode.pdf] (pdf page 2, document page 32). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4673) TypeFactory for Either type
Greg Hogan created FLINK-4673: - Summary: TypeFactory for Either type Key: FLINK-4673 URL: https://issues.apache.org/jira/browse/FLINK-4673 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor I was able to resolve the requirement to specify an explicit {{TypeInformation}} in the pull request for FLINK-4624 by creating a {{TypeInfoFactory}} for the {{Either}} type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4664) Add translator to NullValue
Greg Hogan created FLINK-4664: - Summary: Add translator to NullValue Key: FLINK-4664 URL: https://issues.apache.org/jira/browse/FLINK-4664 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 Existing translators convert from LongValue (the output label type of graph generators) to IntValue, StringValue, and an offset LongValue. Translators can also be used to convert vertex or edge values. This translator will be appropriate for translating these vertex or edge values to NullValue when the values are not used in an algorithm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4594) Validate lower bound in MathUtils.checkedDownCast
Greg Hogan created FLINK-4594: - Summary: Validate lower bound in MathUtils.checkedDownCast Key: FLINK-4594 URL: https://issues.apache.org/jira/browse/FLINK-4594 Project: Flink Issue Type: Bug Components: Core Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial {{MathUtils.checkedDownCast}} only compares against the upper bound {{Integer.MAX_VALUE}}, which has worked with current usage. Rather than adding a second comparison we can replace {noformat} if (value > Integer.MAX_VALUE) { {noformat} with a cast and check {noformat} if ((int)value != value) { ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
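A sketch of the proposed fix (not the actual {{MathUtils}} source): casting and comparing detects any {{long}} outside the full {{int}} range with a single check, covering the lower bound the current code misses.

```java
public class DownCast {
    // Down-cast checking both bounds: (int) value != value is true for any
    // long outside [Integer.MIN_VALUE, Integer.MAX_VALUE]. Sketch of the
    // proposed fix, not the actual MathUtils implementation.
    static int checkedDownCast(long value) {
        int downCast = (int) value;
        if (downCast != value) {
            throw new IllegalArgumentException("Cannot downcast long value " + value + " to integer.");
        }
        return downCast;
    }

    public static void main(String[] args) {
        System.out.println(checkedDownCast(123L)); // 123
        // checkedDownCast(-2147483649L) now throws, which the
        // upper-bound-only comparison would have silently accepted.
    }
}
```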
[jira] [Created] (FLINK-4452) TaskManager network buffer gauges
Greg Hogan created FLINK-4452: - Summary: TaskManager network buffer gauges Key: FLINK-4452 URL: https://issues.apache.org/jira/browse/FLINK-4452 Project: Flink Issue Type: New Feature Components: Metrics Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Add gauges for {{network.getNetworkBufferPool.getTotalNumberOfMemorySegments}} and {{network.getNetworkBufferPool.getNumberOfAvailableMemorySegments}}. Providing insight into the number and proportion of used network buffers is vital and enlightening. Jobs terminate when buffers are not available, and the rule-of-thumb for "Configuring the Network Buffers" from the documentation is way off. For example, running a sort on a single TaskManager with 8 slots I am using 16,000+ buffers which is much greater than 8*8*4 = 256. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
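The shape of the proposal can be sketched with a stand-in registry; Flink's real metrics API ({{org.apache.flink.metrics.Gauge}}) and the buffer pool accessors differ in detail, and the numbers below are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch of the proposed gauges using a stand-in gauge registry; the stub
// pool and its counts are assumptions, not Flink's NetworkBufferPool.
public class NetworkBufferGauges {
    static class BufferPoolStub {
        int getTotalNumberOfMemorySegments() { return 2048; }
        int getNumberOfAvailableMemorySegments() { return 512; }
    }

    public static void main(String[] args) {
        BufferPoolStub pool = new BufferPoolStub();
        Map<String, Supplier<Integer>> gauges = new HashMap<>();
        // Gauges are sampled on demand by the metrics reporter, so they
        // always reflect the pool's current state.
        gauges.put("TotalMemorySegments", pool::getTotalNumberOfMemorySegments);
        gauges.put("AvailableMemorySegments", pool::getNumberOfAvailableMemorySegments);

        System.out.println(gauges.get("TotalMemorySegments").get());      // 2048
        System.out.println(gauges.get("AvailableMemorySegments").get());  // 512
    }
}
```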
[jira] [Created] (FLINK-4447) Include NettyConfig options on Configurations page
Greg Hogan created FLINK-4447: - Summary: Include NettyConfig options on Configurations page Key: FLINK-4447 URL: https://issues.apache.org/jira/browse/FLINK-4447 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.2.0 {{NettyConfig}} looks for the following configuration options which are not listed in the Flink documentation. {noformat} public static final String NUM_ARENAS = "taskmanager.net.num-arenas"; public static final String NUM_THREADS_SERVER = "taskmanager.net.server.numThreads"; public static final String NUM_THREADS_CLIENT = "taskmanager.net.client.numThreads"; public static final String CONNECT_BACKLOG = "taskmanager.net.server.backlog"; public static final String CLIENT_CONNECT_TIMEOUT_SECONDS = "taskmanager.net.client.connectTimeoutSec"; public static final String SEND_RECEIVE_BUFFER_SIZE = "taskmanager.net.sendReceiveBufferSize"; public static final String TRANSPORT_TYPE = "taskmanager.net.transport"; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4481) Maximum results for pairwise algorithms
Greg Hogan created FLINK-4481: - Summary: Maximum results for pairwise algorithms Key: FLINK-4481 URL: https://issues.apache.org/jira/browse/FLINK-4481 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Return the per-vertex maximum scores for algorithms ({{AdamicAdar}}, {{JaccardIndex}}) which return pairwise results. The number of pairwise scores can be >> O(edges) but the number of maximum scores is O(vertices). It can also be most useful to know what vertices a vertex is most similar to. This implementation is very efficient through use of the hash-combine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4705) Instrument FixedLengthRecordSorter
Greg Hogan created FLINK-4705: - Summary: Instrument FixedLengthRecordSorter Key: FLINK-4705 URL: https://issues.apache.org/jira/browse/FLINK-4705 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan The {{NormalizedKeySorter}} sorts on the concatenation of (potentially partial) keys plus an 8-byte pointer to the record. After sorting, each pointer must be dereferenced, which is not cache friendly. The {{FixedLengthRecordSorter}} sorts on the concatenation of full keys followed by the remainder of the record. The records can then be deserialized in sequence. Instrumenting the {{FixedLengthRecordSorter}} requires implementing the comparator methods {{writeWithKeyNormalization}} and {{readWithKeyNormalization}}. Testing {{JaccardIndex}} on an m4.16xlarge the scale 18 runtime dropped from 71.8 to 68.8 s (4.3% faster) and the scale 20 runtime dropped from 546.1 to 501.8 s (8.8% faster). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4729) Use optional VertexCentric CombineFunction
Greg Hogan created FLINK-4729: - Summary: Use optional VertexCentric CombineFunction Key: FLINK-4729 URL: https://issues.apache.org/jira/browse/FLINK-4729 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 Passes through the {{CombineFunction}} to {{VertexCentricIteration}}, and other code cleanup discovered via IntelliJ's code analyzer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4728) Replace reference equality with object equality
Greg Hogan created FLINK-4728: - Summary: Replace reference equality with object equality Key: FLINK-4728 URL: https://issues.apache.org/jira/browse/FLINK-4728 Project: Flink Issue Type: Improvement Components: Core, Optimizer Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 Some cases of testing {{Integer}} equality using {{==}} rather than {{Integer.equals(Integer)}}, and some additional cleanup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
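The pitfall motivating this cleanup: autoboxed {{Integer}} objects outside the JVM's small-value cache are distinct instances, so {{==}} compares references and can return false even when the values match. A minimal illustration:

```java
public class IntegerEquality {
    public static void main(String[] args) {
        // Integer.valueOf caches only [-128, 127] by default, so boxing
        // 1000 twice yields two distinct objects.
        Integer a = 1000;
        Integer b = 1000;
        System.out.println(a == b);       // false on a default JVM (reference equality)
        System.out.println(a.equals(b));  // true (value equality)
    }
}
```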
[jira] [Created] (FLINK-4936) Operator names for Gelly inputs
Greg Hogan created FLINK-4936: - Summary: Operator names for Gelly inputs Key: FLINK-4936 URL: https://issues.apache.org/jira/browse/FLINK-4936 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Provider descriptive operator names for Gelly's {{Graph}} and {{GraphCsvReader}}. Also, condense multiple type conversion maps into a single mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4946) Load jar files from subdirectories of lib
Greg Hogan created FLINK-4946: - Summary: Load jar files from subdirectories of lib Key: FLINK-4946 URL: https://issues.apache.org/jira/browse/FLINK-4946 Project: Flink Issue Type: Improvement Components: Startup Shell Scripts Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Users can more easily track Flink jars with transitive dependencies when copied into subdirectories of {{lib}}. This is the arrangement of {{opt}} for FLINK-4861. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4963) Tabulate edge direction for directed VertexMetrics
Greg Hogan created FLINK-4963: - Summary: Tabulate edge direction for directed VertexMetrics Key: FLINK-4963 URL: https://issues.apache.org/jira/browse/FLINK-4963 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor The current implementation simply counts edges. We can do one better and tabulate unidirectional (u:v but no v:u) and bidirectional edges (u:v and v:u). This is effectively the ['dyadic census'|http://file.scirp.org/pdf/SN_2013012915270187.pdf]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4966) BetweennessCentrality
Greg Hogan created FLINK-4966: - Summary: BetweennessCentrality Key: FLINK-4966 URL: https://issues.apache.org/jira/browse/FLINK-4966 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Betweenness Centrality weights each vertex or edge by the proportion of shortest paths on which it participates. https://en.wikipedia.org/wiki/Betweenness_centrality -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4965) AllPairsShortestPaths
Greg Hogan created FLINK-4965: - Summary: AllPairsShortestPaths Key: FLINK-4965 URL: https://issues.apache.org/jira/browse/FLINK-4965 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Add a Gelly implementation of {{AllPairsShortestPaths}} to complement the existing {{SingleSourceShortestPaths}}. Flink algorithms excel at processing big, sparse data. APSP is big, really big, but not at all sparse. Considering only undirected graphs, each component of size {{n}} will have {{n choose 2}} shortest paths (1,000 vertices => ~500 thousand paths, 1,000,000 vertices => ~500 billion shortest paths). Considerations are directed vs undirected and weighted vs unweighted graphs. The actual shortest path (not merely the distance) is required for follow-on algorithms such as betweenness centrality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4869) Store record pointer after record keys
Greg Hogan created FLINK-4869: - Summary: Store record pointer after record keys Key: FLINK-4869 URL: https://issues.apache.org/jira/browse/FLINK-4869 Project: Flink Issue Type: Sub-task Components: Core Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor {{NormalizedKeySorter}} serializes records into a {{RandomAccessInputView}} separate from the memory segments used for the sort keys. By storing the pointer after the sort keys the addition of the offset is moved from {{NormalizedKeySorter.compare}} which is O(n log n) to other methods which are O\(n). Will run a performance comparison before submitting a PR to see how significant a performance improvement this would yield. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4896) PageRank algorithm for directed graphs
Greg Hogan created FLINK-4896: - Summary: PageRank algorithm for directed graphs Key: FLINK-4896 URL: https://issues.apache.org/jira/browse/FLINK-4896 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Gelly includes PageRank implementations for scatter-gather and gather-sum-apply. Both ship with the warning "The implementation assumes that each page has at least one incoming and one outgoing link." PageRank is a directed algorithm and sources and sinks are common in directed graphs. Sinks drain the total score across the graph which affects convergence and the balance of the random hop (convergence is not currently a feature of Gelly's PageRanks as this is a very recent feature from FLINK-3888). Sources are handled nicely by the algorithm highlighted on Flink's features page under "Iterations and Delta Iterations" since score deltas are transmitted and a source's score never changes (it is always equal to the random hop probability divided by the vertex count). https://flink.apache.org/features.html We should find an implementation featuring convergence and unrestricted processing of directed graphs and move other implementations to Gelly examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4860) Sort performance
Greg Hogan created FLINK-4860: - Summary: Sort performance Key: FLINK-4860 URL: https://issues.apache.org/jira/browse/FLINK-4860 Project: Flink Issue Type: Improvement Reporter: Greg Hogan A super-task for improvements to Flink's sort performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4861) Package optional project artifacts
Greg Hogan created FLINK-4861: - Summary: Package optional project artifacts Key: FLINK-4861 URL: https://issues.apache.org/jira/browse/FLINK-4861 Project: Flink Issue Type: New Feature Components: Build System Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.2.0 Per the mailing list [discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Additional-project-downloads-td13223.html], package the Flink libraries and connectors into subdirectories of a new {{opt}} directory in the release/snapshot tarballs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4970) Parameterize vertex value for SSSP
Greg Hogan created FLINK-4970: - Summary: Parameterize vertex value for SSSP Key: FLINK-4970 URL: https://issues.apache.org/jira/browse/FLINK-4970 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 {{SingleSourceShortestPaths}} and {{GSASingleSourceShortestPaths}} require the input {{Graph}} to provide {{Double}} vertex values. These incoming values are unused so can be replaced with a parameterized type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4996) Make CrossHint @Public
Greg Hogan created FLINK-4996: - Summary: Make CrossHint @Public Key: FLINK-4996 URL: https://issues.apache.org/jira/browse/FLINK-4996 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.2.0 {{CrossHint}} should be annotated {{@Public}} as is {{JoinHint}}. It is currently marked {{@Internal}} by its enclosing class {{CrossOperatorBase}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4934) Triadic Census
Greg Hogan created FLINK-4934: - Summary: Triadic Census Key: FLINK-4934 URL: https://issues.apache.org/jira/browse/FLINK-4934 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan A triad is any three vertices in a graph. An undirected graph has 4 types of triads (with 0, 1, 2, or 3 edges among the three vertices) and a directed graph has 16 types (http://vlado.fmf.uni-lj.si/pub/networks/doc/triads/triads.pdf). This can be implemented as an analytic. The undirected implementation will use {{VertexMetrics}} and {{TriangleCount}}. The directed implementation will use {{VertexDegrees}} and {{TriangleListing}} with postprocessing. This could be added to the {{TriangleListing}} driver in Gelly examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5209) Fix TaskManager metrics
Greg Hogan created FLINK-5209: - Summary: Fix TaskManager metrics Key: FLINK-5209 URL: https://issues.apache.org/jira/browse/FLINK-5209 Project: Flink Issue Type: Bug Components: Webfrontend Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.2.0 Properly propagate the network and non-JVM memory metrics to the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-5485) Mark compiled web frontend files as binary when processed by git diff
Greg Hogan created FLINK-5485: - Summary: Mark compiled web frontend files as binary when processed by git diff Key: FLINK-5485 URL: https://issues.apache.org/jira/browse/FLINK-5485 Project: Flink Issue Type: Improvement Components: Webfrontend Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.3.0 Particularly beneficial now that javascript is minified, we can mark compiled web frontend files as binary when processed by git diff. https://linux.die.net/man/5/gitattributes This does not affect how files are displayed by github. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
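A {{.gitattributes}} entry along these lines achieves the above; the paths are illustrative and the actual compiled-output locations in the Flink repository may differ:

```
# Suppress textual diffs for compiled (minified) web dashboard output;
# paths are illustrative examples, not necessarily the repository's layout.
flink-runtime-web/web-dashboard/web/js/*.js -diff
flink-runtime-web/web-dashboard/web/css/*.css -diff
```

The {{-diff}} attribute makes {{git diff}} report the files as binary without affecting merging or checkout, matching the issue's scope of only changing diff output.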
[jira] [Created] (FLINK-5461) Remove Superfluous TypeInformation Declaration
Greg Hogan created FLINK-5461: - Summary: Remove Superfluous TypeInformation Declaration Key: FLINK-5461 URL: https://issues.apache.org/jira/browse/FLINK-5461 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.3.0 FLINK-4624 updated Gelly's Summarization algorithm to use {{Either}} in order to support types for which the serialization does not support null values. This required the use of explicit {{TypeInformation}} due to {{TypeExtractor}}. FLINK-4673 created a {{TypeInfoFactory}} for {{EitherType}} so the explicit {{TypeInformation}} can be removed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6195) Move gelly-examples jar from opt to examples
Greg Hogan created FLINK-6195: - Summary: Move gelly-examples jar from opt to examples Key: FLINK-6195 URL: https://issues.apache.org/jira/browse/FLINK-6195 Project: Flink Issue Type: Sub-task Components: Gelly Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.3.0 The {{opt}} directory should be reserved for Flink JARs which users may optionally move to {{lib}} to be loaded by the runtime. {{flink-gelly-examples}} is a user program so is being moved to the {{examples}} folder. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6358) Write job details for Gelly examples
Greg Hogan created FLINK-6358: - Summary: Write job details for Gelly examples Key: FLINK-6358 URL: https://issues.apache.org/jira/browse/FLINK-6358 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Add an option to write job details to a file in JSON format. Job details include: job ID, runtime, parameters with values, and accumulators with values. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6357) ParameterTool get unrequested parameters
Greg Hogan created FLINK-6357: - Summary: ParameterTool get unrequested parameters Key: FLINK-6357 URL: https://issues.apache.org/jira/browse/FLINK-6357 Project: Flink Issue Type: Improvement Components: Java API Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor The Gelly examples use {{ParameterTool}} to parse required and optional parameters. In the latter case we should detect if a user mistypes a parameter name. I would like to add a {{Set getUnrequestedParameters()}} method returning parameter names not requested by {{has}} or any of the {{get}} methods. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
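The proposal can be sketched with a minimal stand-in for {{ParameterTool}}; the real class has much richer parsing, and {{getUnrequestedParameters()}} is only a proposed method at this point:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal stand-in illustrating the proposed getUnrequestedParameters();
// this is a sketch, not Flink's actual ParameterTool.
public class ParamSketch {
    private final Map<String, String> data;
    private final Set<String> requested = new HashSet<>();

    ParamSketch(Map<String, String> data) { this.data = data; }

    // Record every key the program asks about, whether or not it is present.
    public boolean has(String key) { requested.add(key); return data.containsKey(key); }
    public String get(String key) { requested.add(key); return data.get(key); }

    // Parameters supplied by the user but never read by the program,
    // e.g. a mistyped option name.
    public Set<String> getUnrequestedParameters() {
        Set<String> unrequested = new HashSet<>(data.keySet());
        unrequested.removeAll(requested);
        return unrequested;
    }

    public static void main(String[] args) {
        Map<String, String> supplied = new HashMap<>();
        supplied.put("input", "a.csv");
        supplied.put("outptu", "b.csv"); // user typo for "output"
        ParamSketch params = new ParamSketch(supplied);
        params.get("input");
        params.has("output");
        System.out.println(params.getUnrequestedParameters()); // [outptu]
    }
}
```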
[jira] [Created] (FLINK-6009) Deprecate DataSetUtils#checksumHashCode
Greg Hogan created FLINK-6009: - Summary: Deprecate DataSetUtils#checksumHashCode Key: FLINK-6009 URL: https://issues.apache.org/jira/browse/FLINK-6009 Project: Flink Issue Type: Improvement Components: Java API Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.3.0 This is likely only used by Gelly and we have a more featureful implementation allowing for multiple outputs and setting the job name. Deprecation will allow this to be removed in Flink 2.0. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-5945) Close function in OuterJoinOperatorBase#executeOnCollections
Greg Hogan created FLINK-5945: - Summary: Close function in OuterJoinOperatorBase#executeOnCollections Key: FLINK-5945 URL: https://issues.apache.org/jira/browse/FLINK-5945 Project: Flink Issue Type: Bug Components: Core Affects Versions: 1.1.4, 1.2.0, 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.3.0, 1.2.1, 1.1.4 {{OuterJoinOperatorBase#executeOnCollections}} does not call {{FunctionUtils.closeFunction(function);}}. I am seeing this affect the Gelly test for the {{HITS}} algorithm when using a convergence threshold rather than a fixed number of iterations. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6268) Object reuse for Either type
Greg Hogan created FLINK-6268: - Summary: Object reuse for Either type Key: FLINK-6268 URL: https://issues.apache.org/jira/browse/FLINK-6268 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor While reviewing test coverage for FLINK-4705 I have come across that {{Either}} only implements partial object reuse (when from and to are both {{Right}}). We can implement full object reuse if {{Left}} stores a reference to a {{Right}} and {{Right}} to a {{Left}}. These references will be {{private}} and will remain {{null}} until set by {{EitherSerializer}} when copying or deserializing with object reuse. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6280) Allow logging with Java flags
Greg Hogan created FLINK-6280: - Summary: Allow logging with Java flags Key: FLINK-6280 URL: https://issues.apache.org/jira/browse/FLINK-6280 Project: Flink Issue Type: Improvement Components: Startup Shell Scripts Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Allow configuring Flink's Java options with the logging prefix and log rotation. For example, this allows the following configurations to write {{.jfr}} and {{.jit}} files alongside the existing {{.log}} and {{.out}} files. {code:language=bash|title=Configuration for Java Flight Recorder} env.java.opts: "-XX:+UnlockCommercialFeatures -XX:+UnlockDiagnosticVMOptions -XX:+FlightRecorder -XX:+DebugNonSafepoints -XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=${LOG_PREFIX}.jfr" {code} {code:language=bash|title=Configuration for JitWatch} env.java.opts: "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${LOG_PREFIX}.jit -XX:+PrintAssembly" {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-7296) Validate commit messages in git pre-receive hook
Greg Hogan created FLINK-7296: - Summary: Validate commit messages in git pre-receive hook Key: FLINK-7296 URL: https://issues.apache.org/jira/browse/FLINK-7296 Project: Flink Issue Type: Improvement Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Would like to investigate a pre-receive (server-side) hook analyzing the commit messages of incoming revisions on the {{master}} branch for the standard JIRA format ({{\[FLINK-\] \[module\] ...}}). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
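The per-commit check such a hook would apply could look like the following; the exact pattern is an assumption based on the "[FLINK-&lt;issue&gt;] [module] ..." convention, and a real hook would also need to allow exceptions such as hotfix commits:

```java
import java.util.regex.Pattern;

// Sketch of the commit-message validation a pre-receive hook could run per
// revision; the format regex is an assumption, not an established standard.
public class CommitMessageCheck {
    static final Pattern FORMAT =
        Pattern.compile("^\\[FLINK-\\d+\\] \\[[^\\]]+\\] .+");

    static boolean isValid(String message) {
        return FORMAT.matcher(message).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("[FLINK-7296] [scripts] Validate commit messages")); // true
        System.out.println(isValid("fix typo")); // false
    }
}
```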
[jira] [Created] (FLINK-7204) CombineHint.NONE
Greg Hogan created FLINK-7204: - Summary: CombineHint.NONE Key: FLINK-7204 URL: https://issues.apache.org/jira/browse/FLINK-7204 Project: Flink Issue Type: New Feature Components: Core Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor FLINK-3477 added a hash-combine preceding the reducer configured with {{CombineHint.HASH}} or {{CombineHint.SORT}} (default). In some cases it may be useful to disable the combiner in {{ReduceNode}} by specifying a new {{CombineHint.NONE}} value. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7211) Exclude Gelly javadoc jar from release
Greg Hogan created FLINK-7211: - Summary: Exclude Gelly javadoc jar from release Key: FLINK-7211 URL: https://issues.apache.org/jira/browse/FLINK-7211 Project: Flink Issue Type: Improvement Components: Build System Affects Versions: 1.4.0, 1.3.2 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7154) Missing call to build CsvTableSource example
Greg Hogan created FLINK-7154: - Summary: Missing call to build CsvTableSource example Key: FLINK-7154 URL: https://issues.apache.org/jira/browse/FLINK-7154 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.4.0, 1.3.2 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial The Java and Scala example code for {{CsvTableSource}} creates a builder but is missing the final call to {{build}}. https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/sourceSinks.html#csvtablesource -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7199) Graph simplification does not set parallelism
Greg Hogan created FLINK-7199: - Summary: Graph simplification does not set parallelism Key: FLINK-7199 URL: https://issues.apache.org/jira/browse/FLINK-7199 Project: Flink Issue Type: Bug Components: Gelly Affects Versions: 1.3.1, 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor The {{Simplify}} parameter should accept and set the parallelism when calling the {{Simplify}} algorithms. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7273) Gelly tests with empty graphs
Greg Hogan created FLINK-7273: - Summary: Gelly tests with empty graphs Key: FLINK-7273 URL: https://issues.apache.org/jira/browse/FLINK-7273 Project: Flink Issue Type: Bug Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.4.0 There exist some tests with empty graphs, but the {{EmptyGraph}} in {{AsmTestBase}} contains vertices but no edges. Add a new {{EmptyGraph}} without vertices and test both empty graphs for each algorithm. {{PageRank}} should (optionally?) include zero-degree vertices in the results. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7277) Weighted PageRank
Greg Hogan created FLINK-7277: - Summary: Weighted PageRank Key: FLINK-7277 URL: https://issues.apache.org/jira/browse/FLINK-7277 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Add a weighted PageRank algorithm to complement the existing unweighted implementation. Edge values store a {{double}} weight value which is summed per vertex in place of the vertex degree. The vertex score is joined as the fraction of vertex weight rather than dividing by the vertex degree. The examples {{Runner}} must now read and generate weighted graphs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7275) Differentiate between normal and power-user cli options in Gelly examples
Greg Hogan created FLINK-7275: - Summary: Differentiate between normal and power-user cli options in Gelly examples Key: FLINK-7275 URL: https://issues.apache.org/jira/browse/FLINK-7275 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan The current "hack" is to preface "power-user" options with a double underscore (e.g. {{__parallelism}}), which are then "hidden" by exclusion from the program usage documentation. Change this to instead be explicit in the {{Parameter}} API and provide a cli option to display "power-user" options. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7276) Gelly algorithm parameters
Greg Hogan created FLINK-7276: - Summary: Gelly algorithm parameters Key: FLINK-7276 URL: https://issues.apache.org/jira/browse/FLINK-7276 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor As with the example drivers, the algorithm configuration fields should be typed to handle {{canMergeConfiguration}} and {{mergeConfiguration}} in {{GraphAlgorithmWrappingBase}} rather than overriding these methods in each algorithm (which has proven brittle). The existing {{OptionalBoolean}} is one example. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7234) Fix CombineHint documentation
Greg Hogan created FLINK-7234: - Summary: Fix CombineHint documentation Key: FLINK-7234 URL: https://issues.apache.org/jira/browse/FLINK-7234 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.2.2, 1.4.0, 1.3.2 Reporter: Greg Hogan Assignee: Greg Hogan The {{CombineHint}} [documentation|https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/batch/index.html] applies to {{DataSet#reduce}}, not {{DataSet#reduceGroup}}, and should also be noted for {{DataSet#distinct}}. It is also set with {{.setCombineHint(CombineHint)}} rather than alongside the UDF parameter. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7031) Document Gelly examples
Greg Hogan created FLINK-7031: - Summary: Document Gelly examples Key: FLINK-7031 URL: https://issues.apache.org/jira/browse/FLINK-7031 Project: Flink Issue Type: New Feature Components: Documentation Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.4.0 The components comprising the Gelly examples runner (inputs, outputs, drivers, and soon transforms) were initially developed for internal Gelly use. As such, the Gelly documentation covers execution of the drivers but does not document the design and structure. The runner has become sufficiently advanced and integral to the development of new Gelly algorithms to warrant a page of documentation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7042) Fix jar file discovery in YARN tests
Greg Hogan created FLINK-7042: - Summary: Fix jar file discovery in YARN tests Key: FLINK-7042 URL: https://issues.apache.org/jira/browse/FLINK-7042 Project: Flink Issue Type: Bug Components: YARN Reporter: Greg Hogan Assignee: Greg Hogan Priority: Critical Running a local {{mvn clean verify}} the following error in {{org.apache.flink.yarn.YARNSessionCapacitySchedulerITCase#perJobYarnClusterWithParallelism}} is caused by the discovery of a spurious file created by an earlier YARN test. {code} 15:45:16,627 INFO org.apache.flink.yarn.YarnTestBase - Running with args [run, -p, 2, -m, yarn-cluster, -yj, /home/ec2-user/flink-upstream/flink-yarn-tests/../flink-dist/target/flink-1.4-SNAPSHOT-bin/flink-1.4-SNAPSHOT/lib/flink-dist_2.10-1.4-SNAPSHOT.jar, -yt, /home/ec2-user/flink-upstream/flink-yarn-tests/../flink-dist/target/flink-1.4-SNAPSHOT-bin/flink-1.4-SNAPSHOT/lib, -yn, 1, -yjm, 768, -ytm, 1024, /home/ec2-user/flink-upstream/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/ec2-user/appcache/application_1498751075681_0001/filecache/13/.tmp_flink-examples-batch_2.10-1.4-SNAPSHOT-WordCount.jar.crc] 15:45:16,628 INFO org.apache.flink.client.CliFrontend - Using configuration directory /home/ec2-user/flink-upstream/flink-yarn-tests/../flink-dist/target/flink-1.4-SNAPSHOT-bin/flink-1.4-SNAPSHOT/conf 15:45:16,628 INFO org.apache.flink.client.CliFrontend - Trying to load configuration file 15:45:16,628 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 1024 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: 
taskmanager.heap.mb, 1024 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1 15:45:16,629 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 8081 15:45:16,629 INFO org.apache.flink.client.CliFrontend - Running 'run' command. 15:45:16,629 INFO org.apache.flink.client.CliFrontend - Building program from JAR file 15:45:16,630 ERROR org.apache.flink.client.CliFrontend - Error while running the command. org.apache.flink.client.program.ProgramInvocationException: Error while opening jar file '/home/ec2-user/flink-upstream/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/ec2-user/appcache/application_1498751075681_0001/filecache/13/.tmp_flink-examples-batch_2.10-1.4-SNAPSHOT-WordCount.jar.crc'. 
error in opening zip file at org.apache.flink.client.program.PackagedProgram.getEntryPointClassNameFromJar(PackagedProgram.java:562) at org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:188) at org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:126) at org.apache.flink.client.CliFrontend.buildProgram(CliFrontend.java:900) at org.apache.flink.client.CliFrontend.run(CliFrontend.java:229) at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1083) at org.apache.flink.yarn.YarnTestBase$Runner.run(YarnTestBase.java:657) Caused by: java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.(ZipFile.java:219) at java.util.zip.ZipFile.(ZipFile.java:149) at java.util.jar.JarFile.(JarFile.java:166) at java.util.jar.JarFile.(JarFile.java:130) at org.apache.flink.client.program.PackagedProgram.getEntryPointClassNameFromJar(PackagedProgram.java:557) ... 6 more 15:45:16,632 INFO org.apache.flink.yarn.YarnTestBase - Runner stopped with exception {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7039) Increase forkCountTestPackage for sudo-based Trav
Greg Hogan created FLINK-7039: - Summary: Increase forkCountTestPackage for sudo-based Trav Key: FLINK-7039 URL: https://issues.apache.org/jira/browse/FLINK-7039 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 https://docs.travis-ci.com/user/ci-environment/ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-6399) Update default Hadoop download version
Greg Hogan created FLINK-6399: - Summary: Update default Hadoop download version Key: FLINK-6399 URL: https://issues.apache.org/jira/browse/FLINK-6399 Project: Flink Issue Type: Bug Components: Project Website Reporter: Greg Hogan [Update|http://flink.apache.org/downloads.html] "If you don’t want to do this, pick the Hadoop 1 version." since Hadoop 1 versions are no longer provided. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6375) Fix LongValue hashCode
Greg Hogan created FLINK-6375: - Summary: Fix LongValue hashCode Key: FLINK-6375 URL: https://issues.apache.org/jira/browse/FLINK-6375 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 2.0.0 Reporter: Greg Hogan Priority: Trivial Match {{LongValue.hashCode}} to {{Long.hashCode}} (and the other numeric types) by simply XORing the high and low words rather than offsetting the hash by adding 43. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
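For reference, Java's {{Long.hashCode}} is specified as {{(int)(value ^ (value >>> 32))}}, i.e. the XOR of the upper and lower 32-bit words. A Python emulation of that Java behavior, with explicit masking to mimic the unsigned shift and 32-bit int truncation:

```python
def java_long_hash(value: int) -> int:
    """Emulate Java's Long.hashCode: (int)(value ^ (value >>> 32))."""
    u = value & 0xFFFFFFFFFFFFFFFF          # view as unsigned 64-bit
    h = (u ^ (u >> 32)) & 0xFFFFFFFF        # XOR high and low words, keep 32 bits
    return h - 0x100000000 if h >= 0x80000000 else h  # back to signed 32-bit int
```

Values up to 2^32 hash to themselves, which preserves the useful property that small longs (the common case) get distinct hashes.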
[jira] [Created] (FLINK-6414) Use scala.binary.version in place of change-scala-version.sh
Greg Hogan created FLINK-6414: - Summary: Use scala.binary.version in place of change-scala-version.sh Key: FLINK-6414 URL: https://issues.apache.org/jira/browse/FLINK-6414 Project: Flink Issue Type: Improvement Components: Build System Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Recent commits have failed to modify {{change-scala-version.sh}}, resulting in broken builds for {{scala-2.11}}. It looks like we can remove the need for this script by replacing hard-coded references to the Scala version with Flink's maven variable {{scala.binary.version}}. I had not initially realized that the change script is [only used for building|https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#scala-versions] and not for switching the IDE environment. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
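The proposal amounts to relying on Maven property interpolation in artifact IDs instead of rewriting hard-coded suffixes. A sketch of a dependency declaration (the module shown is illustrative):

```xml
<!-- The Scala suffix comes from the Maven property rather than a
     hard-coded "2.10"/"2.11" rewritten by change-scala-version.sh. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
</dependency>
```

Switching Scala versions then only requires overriding the property (e.g. {{-Dscala.binary.version=2.11}}) rather than running the script.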
[jira] [Created] (FLINK-6382) Support all numeric types for generated graphs in Gelly examples
Greg Hogan created FLINK-6382: - Summary: Support all numeric types for generated graphs in Gelly examples Key: FLINK-6382 URL: https://issues.apache.org/jira/browse/FLINK-6382 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.3.0 The Gelly examples currently support {{IntValue}}, {{LongValue}}, and {{StringValue}} for {{RMatGraph}}. Allow transformations and tests for all generated graphs for {{ByteValue}}, {{Byte}}, {{ShortValue}}, {{Short}}, {{CharValue}}, {{Character}}, {{Integer}}, {{Long}}, and {{String}}. This is additionally of interest for benchmarking and testing modifications to Flink's internal sort. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6560) Restore maven parallelism in flink-tests
Greg Hogan created FLINK-6560: - Summary: Restore maven parallelism in flink-tests Key: FLINK-6560 URL: https://issues.apache.org/jira/browse/FLINK-6560 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 1.3.0, 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.3.0, 1.4.0 FLINK-6506 added the maven variable {{flink.forkCountTestPackage}} which is used by the TravisCI script but no default value is set. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6616) Clarify provenance of official Docker images
Greg Hogan created FLINK-6616: - Summary: Clarify provenance of official Docker images Key: FLINK-6616 URL: https://issues.apache.org/jira/browse/FLINK-6616 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Critical Fix For: 1.3.0 Note that the official Docker images for Flink are community supported and not an official release of the Apache Flink PMC. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6603) Enable checkstyle on test sources
Greg Hogan created FLINK-6603: - Summary: Enable checkstyle on test sources Key: FLINK-6603 URL: https://issues.apache.org/jira/browse/FLINK-6603 Project: Flink Issue Type: Improvement Components: Streaming Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 With the addition of strict checkstyle to select modules (currently limited to {{flink-streaming-java}}) we can enable the checkstyle flag {{includeTestSourceDirectory}} to perform the same unused imports, whitespace, and other checks on test sources. Should first resolve the import grouping as discussed in FLINK-6107. Also, several tests exceed the 2500 line limit. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6882) Activate checkstyle for runtime/registration
Greg Hogan created FLINK-6882: - Summary: Activate checkstyle for runtime/registration Key: FLINK-6882 URL: https://issues.apache.org/jira/browse/FLINK-6882 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6903) Activate checkstyle for runtime/akka
Greg Hogan created FLINK-6903: - Summary: Activate checkstyle for runtime/akka Key: FLINK-6903 URL: https://issues.apache.org/jira/browse/FLINK-6903 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-6986) Broken links to Photoshop images
Greg Hogan created FLINK-6986: - Summary: Broken links to Photoshop images Key: FLINK-6986 URL: https://issues.apache.org/jira/browse/FLINK-6986 Project: Flink Issue Type: Bug Components: Project Website Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor The "Black outline logo with text" links on the [community|https://flink.apache.org/community.html] page are broken. I'd like to see if we can find a comprehensive solution for broken links. I only noticed this due to random clicking. I think Google can report broken links or we could run our own scan. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-6989) Refactor examples with Output interface
Greg Hogan created FLINK-6989: - Summary: Refactor examples with Output interface Key: FLINK-6989 URL: https://issues.apache.org/jira/browse/FLINK-6989 Project: Flink Issue Type: Sub-task Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.4.0 The current organization of the Gelly examples retains full flexibility by handing the {{Graph}} input to the algorithm {{Driver}} and having the {{Driver}} overload interfaces for the various output types. The outputs must be made independent in order to support {{Transform}}s which are applied between the {{Driver}} and {{Output}} (and also between the {{Input}} and {{Driver}}). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7006) Base class using POJOs for Gelly algorithms
Greg Hogan created FLINK-7006: - Summary: Base class using POJOs for Gelly algorithms Key: FLINK-7006 URL: https://issues.apache.org/jira/browse/FLINK-7006 Project: Flink Issue Type: Sub-task Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.4.0 Gelly algorithms commonly have a {{Result}} class extending a {{Tuple}} type and implementing one of the {{Unary/Binary/TertiaryResult}} interfaces. Add a {{Unary/Binary/TertiaryResultBase}} class implementing each interface and convert the {{Result}} classes to POJOs extending the base result classes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7019) Rework parallelism in Gelly algorithms and examples
Greg Hogan created FLINK-7019: - Summary: Rework parallelism in Gelly algorithms and examples Key: FLINK-7019 URL: https://issues.apache.org/jira/browse/FLINK-7019 Project: Flink Issue Type: Sub-task Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.4.0 Flink job parallelism is set with {{ExecutionConfig#setParallelism}} or with {{-p}} on the command line. The Gelly algorithms {{JaccardIndex}}, {{AdamicAdar}}, {{TriangleListing}}, and {{ClusteringCoefficient}} have intermediate operators which generate output quadratic in the size of the input. These algorithms may need to be run with a high parallelism, but doing so for all operations is wasteful. Thus was introduced "little parallelism". This can be simplified by moving the parallelism parameter to the new common base class, with the rule of thumb of using the algorithm parallelism for all normal (small output) operators. The asymptotically large operators will default to the job parallelism, as will the default algorithm parallelism. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-6707) Activate strict checkstyle for flink-examples
Greg Hogan created FLINK-6707: - Summary: Activate strict checkstyle for flink-examples Key: FLINK-6707 URL: https://issues.apache.org/jira/browse/FLINK-6707 Project: Flink Issue Type: Sub-task Components: Examples Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6709) Activate strict checkstyle for flink-gellies
Greg Hogan created FLINK-6709: - Summary: Activate strict checkstyle for flink-gellies Key: FLINK-6709 URL: https://issues.apache.org/jira/browse/FLINK-6709 Project: Flink Issue Type: Sub-task Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6648) Transforms for Gelly examples
Greg Hogan created FLINK-6648: - Summary: Transforms for Gelly examples Key: FLINK-6648 URL: https://issues.apache.org/jira/browse/FLINK-6648 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.4.0 A primary objective of the Gelly examples {{Runner}} is to make adding new inputs and algorithms as simple and powerful as possible. A recent feature made it possible to translate the key ID of generated graphs to alternative numeric or string representations. For floating point and {{LongValue}} it is desirable to translate the key ID of the algorithm results. Currently a {{Runner}} job consists of an input, an algorithm, and an output. A {{Transform}} will translate the input {{Graph}} and the algorithm output {{DataSet}}. The {{Input}} and algorithm {{Driver}} will return an ordered list of {{Transform}}s which will be executed in that order (processed in reverse order for algorithm output). The {{Transform}} can be configured as are inputs and drivers. Example transforms: - the aforementioned translation of key ID types - surrogate types (String -> Long or Int) for user data - FLINK-4481 Maximum results for pairwise algorithms - FLINK-3625 Graph algorithms to permute graph labels and edges -- This message was sent by Atlassian JIRA (v6.3.15#6346)
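The ordering described above — transforms applied in order to the input and in reverse order to the algorithm output — can be sketched generically. This is a hypothetical interface for illustration, not Gelly's actual API:

```python
class Transform:
    """Hypothetical transform: a forward pass over the input and a
    reverse pass over the algorithm result."""
    def transform_input(self, graph):
        return graph
    def transform_result(self, result):
        return result

def run(graph, transforms, algorithm):
    # The input passes through the transforms in order...
    for t in transforms:
        graph = t.transform_input(graph)
    result = algorithm(graph)
    # ...while the algorithm output is processed in reverse order,
    # undoing or completing each transform's effect symmetrically.
    for t in reversed(transforms):
        result = t.transform_result(result)
    return result
```

The reverse pass on the output mirrors the nesting of the forward pass, so a key-translation transform, for example, can map IDs into a working type on the way in and back to the display type on the way out.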
[jira] [Created] (FLINK-6779) Activate strict checkstyle in flink-scala
Greg Hogan created FLINK-6779: - Summary: Activate strict checkstyle in flink-scala Key: FLINK-6779 URL: https://issues.apache.org/jira/browse/FLINK-6779 Project: Flink Issue Type: Sub-task Components: Scala API Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6872) Add MissingOverride to checkstyle
Greg Hogan created FLINK-6872: - Summary: Add MissingOverride to checkstyle Key: FLINK-6872 URL: https://issues.apache.org/jira/browse/FLINK-6872 Project: Flink Issue Type: New Feature Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor [Verifies|http://checkstyle.sourceforge.net/config_annotation.html#MissingOverride] that the java.lang.Override annotation is present when the @inheritDoc javadoc tag is present. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
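Enabling the check is a one-line addition under the {{TreeWalker}} in the checkstyle configuration; a minimal fragment:

```xml
<module name="TreeWalker">
  <!-- Flags methods documented with {@inheritDoc} that lack @Override. -->
  <module name="MissingOverride"/>
</module>
```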
[jira] [Created] (FLINK-6877) Activate checkstyle for runtime/security
Greg Hogan created FLINK-6877: - Summary: Activate checkstyle for runtime/security Key: FLINK-6877 URL: https://issues.apache.org/jira/browse/FLINK-6877 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6878) Activate checkstyle for runtime/query
Greg Hogan created FLINK-6878: - Summary: Activate checkstyle for runtime/query Key: FLINK-6878 URL: https://issues.apache.org/jira/browse/FLINK-6878 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6880) Activate checkstyle for runtime/iterative
Greg Hogan created FLINK-6880: - Summary: Activate checkstyle for runtime/iterative Key: FLINK-6880 URL: https://issues.apache.org/jira/browse/FLINK-6880 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6879) Activate checkstyle for runtime/memory
Greg Hogan created FLINK-6879: - Summary: Activate checkstyle for runtime/memory Key: FLINK-6879 URL: https://issues.apache.org/jira/browse/FLINK-6879 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6466) Build Hadoop 2.8.0 convenience binaries
Greg Hogan created FLINK-6466: - Summary: Build Hadoop 2.8.0 convenience binaries Key: FLINK-6466 URL: https://issues.apache.org/jira/browse/FLINK-6466 Project: Flink Issue Type: New Feature Components: Build System Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.3.0 As discussed on the dev mailing list, add Hadoop 2.8 to the {{create_release_files.sh}} script and TravisCI test matrix. If there is consensus then references to binaries for old versions of Hadoop could be removed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6777) Activate strict checkstyle for flink-scala-shell
Greg Hogan created FLINK-6777: - Summary: Activate strict checkstyle for flink-scala-shell Key: FLINK-6777 URL: https://issues.apache.org/jira/browse/FLINK-6777 Project: Flink Issue Type: Sub-task Components: Scala Shell Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-6778) Activate strict checkstyle for flink-dist
Greg Hogan created FLINK-6778: - Summary: Activate strict checkstyle for flink-dist Key: FLINK-6778 URL: https://issues.apache.org/jira/browse/FLINK-6778 Project: Flink Issue Type: Sub-task Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-7023) Remaining types for Gelly ValueArrays
Greg Hogan created FLINK-7023: - Summary: Remaining types for Gelly ValueArrays Key: FLINK-7023 URL: https://issues.apache.org/jira/browse/FLINK-7023 Project: Flink Issue Type: Sub-task Components: Gelly Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.4.0 Add implementations of Byte/Char/Double/Float/ShortValueArray. Along with the existing implementations of Int/Long/Null/StringValueArray this covers all 10 CopyableValue types. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8180) Refactor driver outputs
Greg Hogan created FLINK-8180: - Summary: Refactor driver outputs Key: FLINK-8180 URL: https://issues.apache.org/jira/browse/FLINK-8180 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 1.5.0 Reporter: Greg Hogan Assignee: Greg Hogan Fix For: 1.5.0 The change in 1.4 of algorithm results from Tuples to POJOs broke the writing of results as CSV. Testing this was, and is, a challenge and so was not done. There are many additional improvements which can be made based on recent improvements to the Gelly framework. Result hash and analytic results should always be printed to the screen. Results can optionally be written to stdout or to a file. In the latter case the result hash and analytic results (and schema) will also be written to a top-level file. The "verbose" output strings can be replaced with JSON, which is just as human readable but also machine readable. In addition to CSV and JSON it may be simple to support XML, etc. Computed fields will be optionally printed to screen or file (currently these are always printed to screen but never to file). Testing will be simplified since formats are now a separate concern from the stream. Jackson is available to Gelly as a dependency provided in the Flink distribution, but we may want to build Gelly as a fat jar in order to include additional modules (which may require a direct dependency on Jackson, which would fail the checkstyle requirement to use the shaded package). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8126) Update and fix checkstyle
Greg Hogan created FLINK-8126: - Summary: Update and fix checkstyle Key: FLINK-8126 URL: https://issues.apache.org/jira/browse/FLINK-8126 Project: Flink Issue Type: Bug Affects Versions: 1.5.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.5.0 Our current checkstyle configuration (checkstyle version 6.19) is missing some ImportOrder and variable naming errors which are detected in 1) IntelliJ using the same checkstyle version and 2) with the maven-checkstyle-plugin with an up-to-date checkstyle version (8.4). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8223) Update Hadoop versions
Greg Hogan created FLINK-8223: - Summary: Update Hadoop versions Key: FLINK-8223 URL: https://issues.apache.org/jira/browse/FLINK-8223 Project: Flink Issue Type: Improvement Components: Build System Affects Versions: 1.5.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Update 2.7.3 to 2.7.4 and 2.8.0 to 2.8.2. See http://hadoop.apache.org/releases.html -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8222) Update Scala version
Greg Hogan created FLINK-8222: - Summary: Update Scala version Key: FLINK-8222 URL: https://issues.apache.org/jira/browse/FLINK-8222 Project: Flink Issue Type: Improvement Components: Build System Affects Versions: 1.4.0 Reporter: Greg Hogan Assignee: Greg Hogan Update Scala to version {{2.11.12}}. I don't believe this affects the Flink distribution but rather anyone who is compiling Flink or a Flink-quickstart-derived program on a shared system. "A privilege escalation vulnerability (CVE-2017-15288) has been identified in the Scala compilation daemon." https://www.scala-lang.org/news/security-update-nov17.html -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8363) Build Hadoop 2.9.0 convenience binaries
Greg Hogan created FLINK-8363: - Summary: Build Hadoop 2.9.0 convenience binaries Key: FLINK-8363 URL: https://issues.apache.org/jira/browse/FLINK-8363 Project: Flink Issue Type: New Feature Components: Build System Affects Versions: 1.5.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Hadoop 2.9.0 was released on 17 November 2017. A local {{mvn clean verify -Dhadoop.version=2.9.0}} ran successfully. With the new Hadoopless build we may be able to improve the build process by reusing the {{flink-dist}} jars (which differ only in build timestamps) and simply making each Hadoop-specific tarball by copying in the corresponding {{flink-shaded-hadoop2-uber}} jar. What portion of the TravisCI jobs can run Hadoopless? We could build and verify these once and then run a Hadoop-versioned job for each Hadoop version. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8422) Checkstyle for org.apache.flink.api.java.tuple
Greg Hogan created FLINK-8422: - Summary: Checkstyle for org.apache.flink.api.java.tuple Key: FLINK-8422 URL: https://issues.apache.org/jira/browse/FLINK-8422 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.5.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Update {{TupleGenerator}} for Flink's checkstyle and rebuild {{Tuple}} and {{TupleBuilder}} classes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8427) Checkstyle for org.apache.flink.optimizer.costs
Greg Hogan created FLINK-8427: - Summary: Checkstyle for org.apache.flink.optimizer.costs Key: FLINK-8427 URL: https://issues.apache.org/jira/browse/FLINK-8427 Project: Flink Issue Type: Improvement Components: Optimizer Affects Versions: 1.5.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-8361) Remove create_release_files.sh
Greg Hogan created FLINK-8361: - Summary: Remove create_release_files.sh Key: FLINK-8361 URL: https://issues.apache.org/jira/browse/FLINK-8361 Project: Flink Issue Type: Improvement Components: Build System Affects Versions: 1.5.0 Reporter: Greg Hogan Priority: Trivial The monolithic {{create_release_files.sh}} does not support building Flink without Hadoop and looks to have been superseded by the scripts in {{tools/releasing}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)