[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2021-04-27 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334103#comment-17334103
 ] 

Flink Jira Bot commented on FLINK-10203:


This issue was marked "stale-assigned" and has not received an update in 7 
days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.

> Support truncate method for old Hadoop versions in 
> HadoopRecoverableFsDataOutputStream
> --
>
> Key: FLINK-10203
> URL: https://issues.apache.org/jira/browse/FLINK-10203
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Connectors / FileSystem
>Affects Versions: 1.6.0, 1.6.1, 1.7.0
>Reporter: Artsem Semianenka
>Assignee: Artsem Semianenka
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: legacy truncate logic.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The new StreamingFileSink (introduced in Flink 1.6) uses the 
> HadoopRecoverableFsDataOutputStream wrapper to write data to HDFS.
> HadoopRecoverableFsDataOutputStream wraps FSDataOutputStream so that, after a 
> failure, writing can be restored from a certain point of the file and then 
> continue. To achieve this recovery functionality, 
> HadoopRecoverableFsDataOutputStream uses the "truncate" method, which was 
> introduced only in Hadoop 2.7.
> FLINK-14170 enabled the use of StreamingFileSink with 
> OnCheckpointRollingPolicy, but it is still not possible to use 
> StreamingFileSink with DefaultRollingPolicy, which makes writing data to 
> HDFS impractical at scale for HDFS < 2.7.
> Unfortunately, there are a few official Hadoop distributions whose latest 
> versions still use Hadoop 2.6 (for example, Cloudera and Pivotal HD). As a 
> result, Flink's Hadoop connector can't work with these distributions.
> Flink declares that it supports Hadoop from version 2.4.0 upwards 
> ([https://ci.apache.org/projects/flink/flink-docs-release-1.6/start/building.html#hadoop-versions]).
> I suggest we emulate the functionality of the "truncate" method for older 
> Hadoop versions.
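For reference, the emulation the reporter proposes could look roughly like the sketch below. This is a hypothetical illustration only: the class and method names are invented (not Flink's actual implementation), and it uses java.nio against the local filesystem for brevity rather than the Hadoop FileSystem API. The key constraint it mirrors is that an HDFS rename cannot overwrite an existing file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical sketch: emulating "truncate" for Hadoop < 2.7.
final class LegacyTruncater {
    static void truncate(Path file, long length) throws IOException {
        // 1. Copy the first `length` bytes into a "<file>.truncated" side file.
        Path truncated = file.resolveSibling(file.getFileName() + ".truncated");
        byte[] data = Files.readAllBytes(file);
        Files.write(truncated, Arrays.copyOf(data, (int) Math.min(length, data.length)));
        // 2. Delete the original: HDFS rename cannot overwrite an existing file.
        Files.delete(file);
        // 3. Rename the side file back. A crash between steps 2 and 3 is the
        //    window the recovery logic has to handle.
        Files.move(truncated, file);
    }
}
```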



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2021-04-16 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17323477#comment-17323477
 ] 

Flink Jira Bot commented on FLINK-10203:


This issue is assigned but has not received an update in 7 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.



[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-11-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684888#comment-16684888
 ] 

ASF GitHub Bot commented on FLINK-10203:


kl0u commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-438182663
 
 
   Hi @art4ul ! There is already an open PR for that issue, and it is this one:
   https://github.com/apache/flink/pull/7047
   You can also review it if you want, so that it can get merged faster. The 
PR's name also contains the issue number.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-11-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683556#comment-16683556
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-437836407
 
 
   Hi @kl0u, could you maybe share your ticket regarding the implementation of 
the logic that supports the supportsResume() method?






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670237#comment-16670237
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-434730831
 
 
   @kl0u @StephanEwen 
   Hi guys, 
   Regarding your question:
   > - Does HDFS permit to rename to an already existing file name (replacing 
that existing file)?
   
   I've double-checked it. HDFS cannot move a file with overwriting, but this 
pull request resolves the issue.
   
   In case of a failure, after restarting, the 'truncate' method checks 
whether the original file exists:
   -   If the original file exists, start the process from the beginning.
   -   If the original file does not exist but a file with the '*.truncated' 
extension does, the absence of the original file tells us that the truncated 
file was fully written and the original was deleted; the process crashed at 
the stage of renaming the truncated file. We can use the '*.truncated' file as 
the resulting file and finish the truncation process.
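The two-case recovery decision described above can be sketched as follows. All names here are hypothetical, and java.nio stands in for the Hadoop FileSystem API purely for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the crash-recovery decision for the legacy truncate.
final class TruncateRecovery {
    enum Action { RESTART_TRUNCATION, FINISH_RENAME, UNRECOVERABLE }

    static Action recover(Path original, Path truncated) {
        if (Files.exists(original)) {
            // Original still present: the crash happened before the original
            // was deleted, so restart the whole truncation from the beginning.
            return Action.RESTART_TRUNCATION;
        }
        if (Files.exists(truncated)) {
            // Original gone but "*.truncated" present: the side file was fully
            // written, and only the final rename remains to be done.
            return Action.FINISH_RENAME;
        }
        return Action.UNRECOVERABLE; // neither file found
    }
}
```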
   
   Also, I would like to clarify your idea regarding a recoverable writer with 
the "recover for resume" property.
   As far as I understand, in this approach, if the Hadoop version is 2.7 or 
later we instantiate a recoverable writer with the native Hadoop truncate 
logic, and its supportsResume() method returns true. Otherwise, we instantiate 
a recoverable writer that never uses the truncate method (it only creates new 
files), and its supportsResume() method returns false.
   
   If you are OK with this approach, I can prepare another pull request. But 
in that case I need to wait until the logic that checks the supportsResume 
method is implemented.
   Maybe I could help you with it?
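The version-based choice described above could look roughly like this. The factory and the one-method interface shape are invented for illustration; Flink's actual RecoverableWriter interface has more methods than shown here.

```java
// Hypothetical sketch: pick a writer implementation based on the Hadoop
// version, and report resumability through supportsResume().
final class WriterChoice {
    interface Writer { boolean supportsResume(); }

    static Writer forHadoopVersion(int major, int minor) {
        // Native truncate exists from Hadoop 2.7 onwards.
        boolean hasTruncate = major > 2 || (major == 2 && minor >= 7);
        if (hasTruncate) {
            return () -> true;   // in-progress files can be truncated and resumed
        }
        return () -> false;      // no truncate: always roll a new part file
    }
}
```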






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-10-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669135#comment-16669135
 ] 

ASF GitHub Bot commented on FLINK-10203:


kl0u commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-434409845
 
 
   Hi Artsem, you are correct that it is not used, but I already have a branch
   for it, and there is an open Jira for that which I have assigned to myself.
   






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-10-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668928#comment-16668928
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-434359231
 
 
   @StephanEwen I really like your idea regarding a recoverable writer with 
the "recover for resume" property. I found the method you are talking about:
   ```boolean supportsResume()```
   in the 
[RecoverableWriter](https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/fs/RecoverableWriter.java)
 interface, but as far as I can see this method is not used in the Flink 
project. 
   Searching the whole project, I found only the implementation of this 
method, but no invocation of it: 
https://github.com/apache/flink/search?q=supportsResume_q=supportsResume






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-09-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610310#comment-16610310
 ] 

ASF GitHub Bot commented on FLINK-10203:


StephanEwen commented on issue #6608: [FLINK-10203]Support truncate method for 
old Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-420199851
 
 
   We were off at the Flink Forward conference last week, so progress on PRs 
has been slow...
   
   In principle, the design looks fine. To double-check:
 - Does HDFS permit renaming to an already existing file name (replacing 
that existing file)?
 - If yes, this sounds reasonable. If not, this could be an issue (we would 
need to delete the original file before the rename, and a failure at that 
point means the file does not exist on recovery).
   
   There is another option to approach this:
 - The recoverable writer has the option to say whether it can also "recover 
for resume" or only "recover for commit". "Recover for resume" leads to 
appending to the started file, while supporting only "recover for commit" 
means that a new part file would be started after recovery.
 - This version could declare itself to only "recover for commit", in which 
case we would never have to go back to the original file name, but only copy 
from the "part in progress" file to the published file name, avoiding the 
above problem.
 - That would mean we need to have the "truncater" handle the "truncate 
existing file back" logic and the "truncating rename". The legacy Hadoop 
handler would only implement the second - the truncating rename.
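The "truncating rename" in the last bullet could be sketched like this. The helper name is hypothetical and java.nio stands in for the Hadoop FileSystem API; the point it illustrates is that only the valid prefix of the in-progress file is ever written under the published name, so a previously published file never needs to be truncated back.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical sketch of a "truncating rename" for the recover-for-commit
// path: publish only the first `validLength` bytes of the in-progress file.
final class TruncatingRename {
    static void commit(Path inProgress, Path published, long validLength) throws IOException {
        byte[] data = Files.readAllBytes(inProgress);
        // Copy the valid prefix directly to the published name.
        Files.write(published, Arrays.copyOf(data, (int) Math.min(validLength, data.length)));
        Files.delete(inProgress); // drop the in-progress file once published
    }
}
```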






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-09-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609066#comment-16609066
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-419887998
 
 
   Hi guys! Are there any comments regarding this PR? I would like to 
continue the discussion.






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594261#comment-16594261
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-416379325
 
 
   @StephanEwen I've attached the [PDF 
file](https://issues.apache.org/jira/secure/attachment/12937334/legacy%20truncate%20logic.pdf)
 with a description of the legacy truncater logic to the [JIRA 
ticket](https://issues.apache.org/jira/browse/FLINK-10203).
   Please take a look :)






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread Kostas Kloudas (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590683#comment-16590683
 ] 

Kostas Kloudas commented on FLINK-10203:


No problem [~artsem.semianenka]! 
I assigned the issue to you. 



[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread Artsem Semianenka (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590627#comment-16590627
 ] 

Artsem Semianenka commented on FLINK-10203:
---

I'm so sorry guys, this is my first time submitting an issue to an Apache 
project. 



[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread Kostas Kloudas (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590607#comment-16590607
 ] 

Kostas Kloudas commented on FLINK-10203:


Hi [~artsem.semianenka]! 

Given that you are working on the issue, you should also assign it to yourself.

In addition, I am linking the discussion from the mailing list, so that we 
keep track of everything:

https://mail-archives.apache.org/mod_mbox/flink-dev/201808.mbox/%3cf7af3ef2-1407-443e-a092-7b13a2467...@data-artisans.com%3E

Discussions are also better held in JIRA before submitting PRs to GitHub. 
This is not only a personal opinion; I believe it is also an Apache 
requirement, since JIRA is Apache space while GitHub is not.





[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590588#comment-16590588
 ] 

ASF GitHub Bot commented on FLINK-10203:


StephanEwen commented on issue #6608: [FLINK-10203]Support truncate method for 
old Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-415507439
 
 
   I see, that is a fair reason. There are parallel efforts to add Parquet 
support to the old BucketingSink, but I see the point.
   
   Before going into a deep review, can you update the description with how 
exactly the legacy truncater should work: which copy and rename steps it 
performs, and how it behaves under failure / repeated calls?
   
   Also, I would suggest naming it `Truncater` rather than `TruncateManager`. 
Too many managers all around already ;-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590401#comment-16590401
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-415465746
 
 
   @StephanEwen Let me explain our case: the new StreamingFileSink is ideally 
suited for writing Parquet files, and we were excited to find it in the new 
1.6 release, because we had been working on our own sink implementation in 
parallel. But as I mentioned before, we use the latest Cloudera 5.15 
distribution, which is based on Hadoop 2.6, and unfortunately we can't 
upgrade it to a higher Hadoop version. 






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590362#comment-16590362
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul commented on issue #6608: [FLINK-10203]Support truncate method for old 
Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-415453555
 
 
   @StephanEwen Yes, it's critical, because we need to write Parquet files.






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590352#comment-16590352
 ] 

ASF GitHub Bot commented on FLINK-10203:


StephanEwen commented on issue #6608: [FLINK-10203]Support truncate method for 
old Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608#issuecomment-415451715
 
 
   @art4ul Initially, we wanted to keep the new StreamingFileSink code simple 
and clean; that's why we decided to support only Hadoop 2.7+ for HDFS and 
retained the old BucketingSink to cover prior Hadoop versions.
   
   Is it a critical issue on your side not to use the BucketingSink?






[jira] [Commented] (FLINK-10203) Support truncate method for old Hadoop versions in HadoopRecoverableFsDataOutputStream

2018-08-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590284#comment-16590284
 ] 

ASF GitHub Bot commented on FLINK-10203:


art4ul opened a new pull request #6608: [FLINK-10203]Support truncate method 
for old Hadoop versions in HadoopRecoverableFsDataOutputStream
URL: https://github.com/apache/flink/pull/6608
 
 
   ## What is the purpose of the change
   
   The new StreamingFileSink (introduced in Flink 1.6) uses the 
HadoopRecoverableFsDataOutputStream wrapper to write data to HDFS.
   
   HadoopRecoverableFsDataOutputStream is a wrapper around FSDataOutputStream 
that makes it possible to restore from a certain point of a file after a 
failure and continue writing data. To achieve this recovery functionality, 
HadoopRecoverableFsDataOutputStream uses the "truncate" method, which was 
introduced only in Hadoop 2.7.
   
   Unfortunately, a few official Hadoop distributions still use Hadoop 2.6 in 
their latest versions (among them: Cloudera, Pivotal HD). As a result, Flink 
can't work with these distributions.
   
   Flink declares that it supports Hadoop from version 2.4.0 upwards 
(https://ci.apache.org/projects/flink/flink-docs-release-1.6/start/building.html#hadoop-versions)
   
   I suggest we emulate the functionality of the "truncate" method for older 
Hadoop versions.
   
   This is a possible fix; I would like to start the discussion here. 
   The fix for this issue is vital for us as Hadoop 2.6 users.
   
   ## Brief change log
   
 - Add a new abstraction, TruncateManager
 - Add an implementation for old Hadoop versions (LegacyTruncateManager)
 - Add an implementation for Hadoop 2.7 and upwards
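   The abstraction named in the change log might look like the following 
hypothetical sketch. The reflection probe stands in for detecting whether the 
running Hadoop version's FileSystem class offers a native truncate (which 
only exists from 2.7 onwards); the names mirror the TruncateManager / 
LegacyTruncateManager split but are illustrative only:

```java
import java.lang.reflect.Method;

class TruncaterFactory {

    /** Common interface: truncate the file at 'path' to 'length' bytes. */
    interface Truncater {
        void truncate(String path, long length) throws Exception;
    }

    /**
     * Probe whether the given file-system class declares a native
     * truncate method, as Hadoop's FileSystem does from 2.7 onwards.
     */
    static boolean hasNativeTruncate(Class<?> fsClass) {
        for (Method m : fsClass.getMethods()) {
            if (m.getName().equals("truncate")) {
                return true;
            }
        }
        return false;
    }

    /** Choose the native implementation when available, else the legacy one. */
    static Truncater create(Class<?> fsClass, Truncater nativeImpl, Truncater legacyImpl) {
        return hasNativeTruncate(fsClass) ? nativeImpl : legacyImpl;
    }
}
```

   This keeps the version-specific logic behind one interface, so the 
recoverable stream never needs to know which Hadoop version it runs against.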
   
   
   ## Verifying this change
   
   This change contains only a possible solution, currently without any test 
coverage.
   Tests will be added after a final consensus on the implementation.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no, but I think it 
should be documented)
 - If yes, how is the feature documented? (not documented yet)
   



