[ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402759#comment-15402759
 ] 

ASF GitHub Bot commented on FLINK-2090:
---------------------------------------

GitHub user mushketyk opened a pull request:

    https://github.com/apache/flink/pull/2323

    [FLINK-2090] toString of CollectionInputFormat takes long time when t…

    Thanks for contributing to Apache Flink. Before you open your pull request,
    please take the following check list into consideration.
    If your changes take all of the items into account, feel free to open your
    pull request. For more information and/or questions please refer to the
    [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
    In addition to going through the list, please provide a meaningful
    description of your changes.
    
    - [x] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira
        title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the
        JIRA id)
    
    - [x] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [x] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis
        build has passed
    
    …he collection is huge

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mushketyk/flink fast-to-string

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2323.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2323
    
----
commit 76c5b7dd1cf12b17b7601b2d1c8ea7cc475a031c
Author: Ivan Mushketyk <[email protected]>
Date:   2016-08-01T19:39:17Z

    [FLINK-2090] toString of CollectionInputFormat takes long time when the
    collection is huge

----


> toString of CollectionInputFormat takes long time when the collection is huge
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-2090
>                 URL: https://issues.apache.org/jira/browse/FLINK-2090
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Till Rohrmann
>            Assignee: Ivan Mushketyk
>            Priority: Minor
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}, which in turn calls {{toString}} on each 
> element. If the {{Collection}} contains many elements, or if the individual 
> {{toString}} calls are slow, generating the string can take a considerable 
> amount of time. [~mikiobraun] noticed this when he inserted several jBLAS 
> matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}}, 
> or whether printing only the first 3 elements in the {{toString}} method 
> would be enough.
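
The suggestion in the description (print only the first few elements) can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from the pull request: the class name `BoundedToString`, the helper `describe`, and the constant `MAX_ELEMENTS` are hypothetical names chosen for the example, and the limit of 3 is taken from the issue text.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

public class BoundedToString {
    // Hypothetical limit, following the "first 3 elements" idea in the issue.
    static final int MAX_ELEMENTS = 3;

    // Builds a short description by calling toString on at most
    // MAX_ELEMENTS elements, so the cost no longer depends on the
    // size of the collection.
    static <T> String describe(Collection<T> data) {
        StringBuilder sb = new StringBuilder("[");
        Iterator<T> it = data.iterator();
        int printed = 0;
        while (it.hasNext() && printed < MAX_ELEMENTS) {
            if (printed > 0) {
                sb.append(", ");
            }
            sb.append(it.next());
            printed++;
        }
        // Mark that elements were left out instead of printing them.
        if (it.hasNext()) {
            sb.append(", ...");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<Integer> big = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            big.add(i);
        }
        System.out.println(describe(big)); // prints "[0, 1, 2, ...]"
    }
}
```

Only the first three elements are ever converted to strings, so an expensive per-element {{toString}} (such as for a large matrix) is invoked at most three times regardless of how many elements the collection holds.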



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
