[jira] [Assigned] (HUDI-1667) Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set non-null value in field which is null if vectorization is enabled.

2021-03-21 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu reassigned HUDI-1667:
-

Assignee: Lietong Liu

> Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set 
> non-null value in field which is null if vectorization is enabled.
> ---
>
> Key: HUDI-1667
> URL: https://issues.apache.org/jira/browse/HUDI-1667
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Lietong Liu
>Assignee: Lietong Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>
> When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
> InternalRow based on requiredStructSchema.
> {code:java}
> // Builds a new row containing only the fields in requiredStructSchema
> private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
>   val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
>   val posIterator = requiredFieldPosition.iterator
>   var curIndex = 0
>   tableState.requiredStructSchema.foreach(
>     f => {
>       val curPos = posIterator.next()
>       val curField = row.get(curPos, f.dataType)
>       rowToReturn.update(curIndex, curField)
>       curIndex = curIndex + 1
>     }
>   )
>   rowToReturn
> }
> {code}
> Hoodie does not check whether a field is null (isNullAt) when reading each 
> value here.
> If vectorization is enabled, the row is a *ColumnarBatchRow*. A 
> *ColumnarBatchRow* may return a non-null value even if the value of the field 
> is null. So Hoodie may set a non-null value in a field that is null.
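
A minimal sketch of the null check that the description says is missing, written in plain Java so it runs without Spark. The ColumnarRow class, its fields, and the project method are hypothetical stand-ins for illustration, not Hudi or Spark APIs, and this is not necessarily how the issue was fixed. The row keeps values and null flags separately, so get() can hand back a stale value unless isNullAt() is consulted first.

```java
import java.util.Arrays;

public class NullAwareProjection {

    /** Hypothetical stand-in for a columnar-backed row: values and null flags are stored separately. */
    static final class ColumnarRow {
        private final Object[] values;  // may hold stale data in slots flagged as null
        private final boolean[] nulls;  // true where the field is logically null

        ColumnarRow(Object[] values, boolean[] nulls) {
            this.values = values;
            this.nulls = nulls;
        }

        boolean isNullAt(int pos) {
            return nulls[pos];
        }

        // Returns the raw slot content and does NOT consult the null flag.
        Object get(int pos) {
            return values[pos];
        }
    }

    /** Copies the required positions, checking the null flag before reading each value. */
    static Object[] project(ColumnarRow row, int[] requiredPositions) {
        Object[] out = new Object[requiredPositions.length];
        for (int i = 0; i < requiredPositions.length; i++) {
            int pos = requiredPositions[i];
            out[i] = row.isNullAt(pos) ? null : row.get(pos);
        }
        return out;
    }

    public static void main(String[] args) {
        // Position 1 is flagged null, but its backing slot still contains 0.
        ColumnarRow row = new ColumnarRow(
            new Object[] {"a", 0, "c"}, new boolean[] {false, true, false});
        System.out.println(Arrays.toString(project(row, new int[] {2, 1})));
    }
}
```

Running main prints [c, null]: the slot flagged as null is copied as null even though a stale 0 sits in the backing array.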



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HUDI-1667) Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set non-null value in field which is null if vectorization is enabled.

2021-03-21 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu resolved HUDI-1667.
---
Resolution: Fixed

> Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set 
> non-null value in field which is null if vectorization is enabled.
> ---
>
> Key: HUDI-1667
> URL: https://issues.apache.org/jira/browse/HUDI-1667
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Lietong Liu
>Assignee: Lietong Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>
> When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
> InternalRow based on requiredStructSchema.
> {code:java}
> // Builds a new row containing only the fields in requiredStructSchema
> private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
>   val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
>   val posIterator = requiredFieldPosition.iterator
>   var curIndex = 0
>   tableState.requiredStructSchema.foreach(
>     f => {
>       val curPos = posIterator.next()
>       val curField = row.get(curPos, f.dataType)
>       rowToReturn.update(curIndex, curField)
>       curIndex = curIndex + 1
>     }
>   )
>   rowToReturn
> }
> {code}
> Hoodie does not check whether a field is null (isNullAt) when reading each 
> value here.
> If vectorization is enabled, the row is a *ColumnarBatchRow*. A 
> *ColumnarBatchRow* may return a non-null value even if the value of the field 
> is null. So Hoodie may set a non-null value in a field that is null.





[jira] [Updated] (HUDI-1667) Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set non-null value in field which is null if vectorization is enabled.

2021-03-21 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1667:
--
Status: In Progress  (was: Open)

> Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set 
> non-null value in field which is null if vectorization is enabled.
> ---
>
> Key: HUDI-1667
> URL: https://issues.apache.org/jira/browse/HUDI-1667
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Lietong Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>
> When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
> InternalRow based on requiredStructSchema.
> {code:java}
> // Builds a new row containing only the fields in requiredStructSchema
> private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
>   val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
>   val posIterator = requiredFieldPosition.iterator
>   var curIndex = 0
>   tableState.requiredStructSchema.foreach(
>     f => {
>       val curPos = posIterator.next()
>       val curField = row.get(curPos, f.dataType)
>       rowToReturn.update(curIndex, curField)
>       curIndex = curIndex + 1
>     }
>   )
>   rowToReturn
> }
> {code}
> Hoodie does not check whether a field is null (isNullAt) when reading each 
> value here.
> If vectorization is enabled, the row is a *ColumnarBatchRow*. A 
> *ColumnarBatchRow* may return a non-null value even if the value of the field 
> is null. So Hoodie may set a non-null value in a field that is null.





[jira] [Updated] (HUDI-1667) Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set non-null value in field which is null if vectorization is enabled.

2021-03-05 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1667:
--
Fix Version/s: 0.6.0
  Description: 
When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
InternalRow based on requiredStructSchema.
{code:java}
// Builds a new row containing only the fields in requiredStructSchema
private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
  val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
  val posIterator = requiredFieldPosition.iterator
  var curIndex = 0
  tableState.requiredStructSchema.foreach(
    f => {
      val curPos = posIterator.next()
      val curField = row.get(curPos, f.dataType)
      rowToReturn.update(curIndex, curField)
      curIndex = curIndex + 1
    }
  )
  rowToReturn
}
{code}
Hoodie does not check whether a field is null (isNullAt) when reading each 
value here.

If vectorization is enabled, the row is a *ColumnarBatchRow*. A 
*ColumnarBatchRow* may return a non-null value even if the value of the field 
is null. So Hoodie may set a non-null value in a field that is null.

  was:
When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
InternalRow based on requiredStructSchema.
{code:java}
// Builds a new row containing only the fields in requiredStructSchema
private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
  val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
  val posIterator = requiredFieldPosition.iterator
  var curIndex = 0
  tableState.requiredStructSchema.foreach(
    f => {
      val curPos = posIterator.next()
      val curField = row.get(curPos, f.dataType)
      rowToReturn.update(curIndex, curField)
      curIndex = curIndex + 1
    }
  )
  rowToReturn
}
{code}
 


> Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set 
> non-null value in field which is null if vectorization is enabled.
> ---
>
> Key: HUDI-1667
> URL: https://issues.apache.org/jira/browse/HUDI-1667
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
>
> When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
> InternalRow based on requiredStructSchema.
> {code:java}
> // Builds a new row containing only the fields in requiredStructSchema
> private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
>   val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
>   val posIterator = requiredFieldPosition.iterator
>   var curIndex = 0
>   tableState.requiredStructSchema.foreach(
>     f => {
>       val curPos = posIterator.next()
>       val curField = row.get(curPos, f.dataType)
>       rowToReturn.update(curIndex, curField)
>       curIndex = curIndex + 1
>     }
>   )
>   rowToReturn
> }
> {code}
> Hoodie does not check whether a field is null (isNullAt) when reading each 
> value here.
> If vectorization is enabled, the row is a *ColumnarBatchRow*. A 
> *ColumnarBatchRow* may return a non-null value even if the value of the field 
> is null. So Hoodie may set a non-null value in a field that is null.





[jira] [Created] (HUDI-1667) Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set non-null value in field which is null if vectorization is enabled.

2021-03-05 Thread Lietong Liu (Jira)
Lietong Liu created HUDI-1667:
-

 Summary: Fix bug when HoodieMergeOnReadRDD read record from base 
file, Hoodie may set non-null value in field which is null if vectorization is 
enabled.
 Key: HUDI-1667
 URL: https://issues.apache.org/jira/browse/HUDI-1667
 Project: Apache Hudi
  Issue Type: Bug
  Components: Common Core
Reporter: Lietong Liu


When HoodieMergeOnReadRDD reads a record from the base file, it creates a new 
InternalRow based on requiredStructSchema.
{code:java}
// Builds a new row containing only the fields in requiredStructSchema
private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
  val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
  val posIterator = requiredFieldPosition.iterator
  var curIndex = 0
  tableState.requiredStructSchema.foreach(
    f => {
      val curPos = posIterator.next()
      val curField = row.get(curPos, f.dataType)
      rowToReturn.update(curIndex, curField)
      curIndex = curIndex + 1
    }
  )
  rowToReturn
}
{code}
 





[jira] [Resolved] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-03-05 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu resolved HUDI-1583.
---
Resolution: Fixed

> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
>
> When 'spark.speculation' is enabled, there may be a log file with zero size.
> *HoodieLogFormatReader.hasNext()* returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
>  
> {code:java}
> @Override
> public boolean hasNext() {
>   if (currentReader == null) {
>     return false;
>   } else if (currentReader.hasNext()) {
>     return true;
>   } else if (logFiles.size() > 0) {
>     try {
>       HoodieLogFile nextLogFile = logFiles.remove(0);
>       // Close the previous reader immediately only if readBlocksLazily is false
>       if (!readBlocksLazily) {
>         this.currentReader.close();
>       } else {
>         this.prevReadersInOpenState.add(currentReader);
>       }
>       this.currentReader = new HoodieLogFileReader(
>           fs, nextLogFile, readerSchema, bufferSize, readBlocksLazily, false);
>     } catch (IOException io) {
>       throw new HoodieIOException("unable to initialize read with log file ", io);
>     }
>     LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
>     return this.currentReader.hasNext() || hasNext();
>   }
>   return false;
> }
> {code}
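
One way to avoid stopping at an empty source is to keep advancing until a non-empty one is found or the list is exhausted. The sketch below is a generic, self-contained Java illustration under that assumption. SkippingChainedIterator and the in-memory lists standing in for log files are hypothetical, not Hudi classes, and this is not presented as the fix that was merged.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class SkippingChainedIterator<T> implements Iterator<T> {

    private final Deque<List<T>> remaining;  // stand-in for the pending log files
    private Iterator<T> current;             // stand-in for the current reader

    public SkippingChainedIterator(List<List<T>> sources) {
        this.remaining = new ArrayDeque<>(sources);
        this.current = remaining.isEmpty() ? null : remaining.poll().iterator();
    }

    @Override
    public boolean hasNext() {
        // Keep moving to the next source while the current one is exhausted,
        // so an empty source in the middle cannot end the iteration early.
        while (current != null) {
            if (current.hasNext()) {
                return true;
            }
            List<T> next = remaining.poll();  // null once no sources are left
            current = (next == null) ? null : next.iterator();
        }
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    public static void main(String[] args) {
        // The middle "file" is empty, like a zero-size log file.
        List<List<String>> files = Arrays.asList(
            Arrays.asList("a1"),
            Collections.<String>emptyList(),
            Arrays.asList("c1", "c2"));
        List<String> seen = new ArrayList<>();
        new SkippingChainedIterator<String>(files).forEachRemaining(seen::add);
        System.out.println(seen);
    }
}
```

Running main prints [a1, c1, c2]; the empty middle source does not end the iteration.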





[jira] [Updated] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-19 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1583:
--
Description: 
When 'spark.speculation' is enabled, there may be a log file with zero size.

*HoodieLogFormatReader.hasNext()* returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

{code:java}
@Override
public boolean hasNext() {
  if (currentReader == null) {
    return false;
  } else if (currentReader.hasNext()) {
    return true;
  } else if (logFiles.size() > 0) {
    try {
      HoodieLogFile nextLogFile = logFiles.remove(0);
      // Close the previous reader immediately only if readBlocksLazily is false
      if (!readBlocksLazily) {
        this.currentReader.close();
      } else {
        this.prevReadersInOpenState.add(currentReader);
      }
      this.currentReader = new HoodieLogFileReader(
          fs, nextLogFile, readerSchema, bufferSize, readBlocksLazily, false);
    } catch (IOException io) {
      throw new HoodieIOException("unable to initialize read with log file ", io);
    }
    LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
    return this.currentReader.hasNext() || hasNext();
  }
  return false;
}
{code}
 

 

 

  was:
When `spark.speculation` is enabled, there may be a log file with zero size.

`HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

```
@Override
public boolean hasNext() {
  if (currentReader == null) {
    return false;
  } else if (currentReader.hasNext()) {
    return true;
  } else if (logFiles.size() > 0) {
    try {
      HoodieLogFile nextLogFile = logFiles.remove(0);
      // Close the previous reader immediately only if readBlocksLazily is false
      if (!readBlocksLazily) {
        this.currentReader.close();
      } else {
        this.prevReadersInOpenState.add(currentReader);
      }
      this.currentReader = new HoodieLogFileReader(
          fs, nextLogFile, readerSchema, bufferSize, readBlocksLazily, false);
    } catch (IOException io) {
      throw new HoodieIOException("unable to initialize read with log file ", io);
    }
    LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
    return this.currentReader.hasNext() || hasNext();
  }
  return false;
}
```

 

 

 


> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
>
> When 'spark.speculation' is enabled, there may be a log file with zero size.
> *HoodieLogFormatReader.hasNext()* returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
>  
> {code:java}
> @Override
> public boolean hasNext() {
>   if (currentReader == null) {
>     return false;
>   } else if (currentReader.hasNext()) {
>     return true;
>   } else if (logFiles.size() > 0) {
>     try {
>       HoodieLogFile nextLogFile = logFiles.remove(0);
>       // Close the previous reader immediately only if readBlocksLazily is false
>       if (!readBlocksLazily) {
>         this.currentReader.close();
>       } else {
>         this.prevReadersInOpenState.add(currentReader);
>       }
>       this.currentReader = new HoodieLogFileReader(
>           fs, nextLogFile, readerSchema, bufferSize, readBlocksLazily, false);
>     } catch (IOException io) {
>       throw new HoodieIOException("unable to initialize read with log file ", io);
>     }
>     LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
>     return this.currentReader.hasNext() || hasNext();
>   }
>   return false;
> }
> {code}
>  
>  
>  





[jira] [Updated] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-19 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1583:
--
Description: 
When `spark.speculation` is enabled, there may be a log file with zero size.

`HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

```
@Override
public boolean hasNext() {
  if (currentReader == null) {
    return false;
  } else if (currentReader.hasNext()) {
    return true;
  } else if (logFiles.size() > 0) {
    try {
      HoodieLogFile nextLogFile = logFiles.remove(0);
      // Close the previous reader immediately only if readBlocksLazily is false
      if (!readBlocksLazily) {
        this.currentReader.close();
      } else {
        this.prevReadersInOpenState.add(currentReader);
      }
      this.currentReader = new HoodieLogFileReader(
          fs, nextLogFile, readerSchema, bufferSize, readBlocksLazily, false);
    } catch (IOException io) {
      throw new HoodieIOException("unable to initialize read with log file ", io);
    }
    LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
    return this.currentReader.hasNext() || hasNext();
  }
  return false;
}
```

 

 

 

  was:
When `spark.speculation` is enabled, there may be a log file with zero size.

`HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

 

 

 


> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
>
> When `spark.speculation` is enabled, there may be a log file with zero size.
> `HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
> ```
> @Override
> public boolean hasNext() {
>   if (currentReader == null) {
>     return false;
>   } else if (currentReader.hasNext()) {
>     return true;
>   } else if (logFiles.size() > 0) {
>     try {
>       HoodieLogFile nextLogFile = logFiles.remove(0);
>       // Close the previous reader immediately only if readBlocksLazily is false
>       if (!readBlocksLazily) {
>         this.currentReader.close();
>       } else {
>         this.prevReadersInOpenState.add(currentReader);
>       }
>       this.currentReader = new HoodieLogFileReader(
>           fs, nextLogFile, readerSchema, bufferSize, readBlocksLazily, false);
>     } catch (IOException io) {
>       throw new HoodieIOException("unable to initialize read with log file ", io);
>     }
>     LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
>     return this.currentReader.hasNext() || hasNext();
>   }
>   return false;
> }
> ```
>  
>  
>  





[jira] [Updated] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-19 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1583:
--
Attachment: (was: image-2021-02-19-19-07-49-264.png)

> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
>
> When `spark.speculation` is enabled, there may be a log file with zero size.
> `HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
>  
>  
>  





[jira] [Updated] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-19 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1583:
--
Description: 
When `spark.speculation` is enabled, there may be a log file with zero size.

`HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

 

 

 

  was:
When `spark.speculation` is enabled, there may be a log file with zero size.

`HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

 

!image-2021-02-19-19-07-49-264.png!

 


> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
>
> When `spark.speculation` is enabled, there may be a log file with zero size.
> `HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
>  
>  
>  





[jira] [Updated] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-19 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1583:
--
Attachment: image-2021-02-19-19-07-49-264.png

> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: image-2021-02-19-19-07-49-264.png
>
>
> When `spark.speculation` is enabled, there may be a log file with zero size.
> `HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
>  
> !image-2021-02-19-19-07-49-264.png!
>  





[jira] [Updated] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-19 Thread Lietong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lietong Liu updated HUDI-1583:
--
Description: 
When `spark.speculation` is enabled, there may be a log file with zero size.

`HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
with zero size, which causes the remaining log files to be skipped.

 

!image-2021-02-19-19-07-49-264.png!

 

> Hudi will skip remaining log files if there is logFile with zero size in  
> logFileList when merge on read.
> -
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Common Core
>Affects Versions: 0.6.0
>Reporter: Lietong Liu
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: image-2021-02-19-19-07-49-264.png
>
>
> When `spark.speculation` is enabled, there may be a log file with zero size.
> `HoodieLogFormatReader.hasNext()` returns false when it encounters a log file 
> with zero size, which causes the remaining log files to be skipped.
>  
> !image-2021-02-19-19-07-49-264.png!
>  





[jira] [Created] (HUDI-1583) Hudi will skip remaining log files if there is logFile with zero size in logFileList when merge on read.

2021-02-04 Thread Lietong Liu (Jira)
Lietong Liu created HUDI-1583:
-

 Summary: Hudi will skip remaining log files if there is logFile 
with zero size in  logFileList when merge on read.
 Key: HUDI-1583
 URL: https://issues.apache.org/jira/browse/HUDI-1583
 Project: Apache Hudi
  Issue Type: Bug
  Components: Common Core
Affects Versions: 0.6.0
Reporter: Lietong Liu
 Fix For: 0.6.0





