[jira] [Comment Edited] (ARROW-8909) [Java] Out of order writes using setSafe
[ https://issues.apache.org/jira/browse/ARROW-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17126397#comment-17126397 ]

Saurabh edited comment on ARROW-8909 at 6/5/20, 5:12 AM:
---------------------------------------------------------

[~fan_li_ya] Thanks for the response and for updating the documentation.

was (Author: saurabhm):
[~fan_li_ya] Thanks for the response and thanks for updating the documentation.

> [Java] Out of order writes using setSafe
> ----------------------------------------
>
>                 Key: ARROW-8909
>                 URL: https://issues.apache.org/jira/browse/ARROW-8909
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>            Reporter: Saurabh
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I noticed that calling setSafe on a VarCharVector with indices that are not in
> increasing order causes lastIndex to be set to the index used in the last call
> to setSafe.
> Is this documented and expected behavior?
> Sample code:
> {code:java}
> import java.util.Collections;
> import lombok.extern.slf4j.Slf4j;
> import org.apache.arrow.memory.RootAllocator;
> import org.apache.arrow.vector.VarCharVector;
> import org.apache.arrow.vector.VectorSchemaRoot;
> import org.apache.arrow.vector.types.pojo.ArrowType;
> import org.apache.arrow.vector.types.pojo.Field;
> import org.apache.arrow.vector.types.pojo.Schema;
> import org.apache.arrow.vector.util.Text;
>
> @Slf4j
> public class ATest {
>   public static void main(String[] args) {
>     // Single nullable UTF-8 column named "Data".
>     Schema schema = new Schema(Collections.singletonList(
>         Field.nullable("Data", new ArrowType.Utf8())));
>     try (VectorSchemaRoot vroot = VectorSchemaRoot.create(schema, new RootAllocator())) {
>       VarCharVector vec = (VarCharVector) vroot.getVector("Data");
>       // Fill indices 0..9 in increasing order.
>       for (int i = 0; i < 10; i++) {
>         vec.setSafe(i, new Text(Integer.toString(i) + "_mtest"));
>       }
>       // Out-of-order write: overwrite index 7 after index 9 has been set.
>       vec.setSafe(7, new Text(Integer.toString(7) + "_new"));
>       log.info("Data at index 8 Before {}", vec.getObject(8));
>       vroot.setRowCount(10);
>       log.info("Data at index 8 After {}", vec.getObject(8));
>       log.info(vroot.contentToTSVString());
>     }
>   }
> }
> {code}
>
> If I don't set index 7 after the loop, I get all the 0_mtest, 1_mtest,
> ..., 9_mtest entries.
> If I set index 7 after the loop, I see 0_mtest, ..., 5_mtest, 6_mtest, 7_new.
> Before the setRowCount call, the data at index 8 is *st8_mtest* and index 9
> is *9_mtest*.
> After the setRowCount call, the data at index 8 is "" and index 9 is "".
> With a replacement text longer than the 4 characters of _new, it eats further
> into the data at the following indices.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
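The reported values are what one would expect if each entry is located purely by a pair of offsets into one shared byte buffer. The following is a minimal sketch of that idea, not Arrow's actual code: the class `VarWidthModel` and all of its members are hypothetical names, the fixed 1024-byte buffer is a simplification, and the hole-filling rule is an assumption made to match the behavior described in the issue.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified model of a variable-width vector (not Arrow code).
final class VarWidthModel {
    private final byte[] data = new byte[1024]; // fixed size keeps the sketch short
    private final int[] offsets;                // value i occupies data[offsets[i] .. offsets[i + 1])
    private int lastSet = -1;                   // index of the most recent write

    VarWidthModel(int capacity) {
        offsets = new int[capacity + 1];
    }

    // Writes a value at index; only correct when indices arrive in increasing order.
    void set(int index, String value) {
        fillHoles(index);
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int start = offsets[index];
        System.arraycopy(bytes, 0, data, start, bytes.length);
        offsets[index + 1] = start + bytes.length; // this also moves the start of value index + 1
        lastSet = index;
    }

    // Assumed rule: every index after the last write is treated as an empty value.
    void setValueCount(int count) {
        fillHoles(count);
    }

    private void fillHoles(int index) {
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i]; // zero-length entry
        }
        lastSet = index - 1;
    }

    String get(int index) {
        int start = offsets[index];
        return new String(data, start, offsets[index + 1] - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        VarWidthModel vec = new VarWidthModel(10);
        for (int i = 0; i < 10; i++) {
            vec.set(i, i + "_mtest");
        }
        vec.set(7, "7_new"); // out-of-order overwrite with a shorter value
        // prints: before: [st8_mtest] [9_mtest]
        System.out.println("before: [" + vec.get(8) + "] [" + vec.get(9) + "]");
        vec.setValueCount(10);
        // prints: after:  [] []
        System.out.println("after:  [" + vec.get(8) + "] [" + vec.get(9) + "]");
    }
}
```

In this model the shorter value moves the start of index 8 back by two bytes, so index 8 picks up the leftover "st" of the old "7_mtest". Finalizing the count then empties everything after the last written index, which matches the "" values in the report.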
[jira] [Commented] (ARROW-8909) [Java] Out of order writes using setSafe
[ https://issues.apache.org/jira/browse/ARROW-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17126397#comment-17126397 ]

Saurabh commented on ARROW-8909:
--------------------------------

[~fan_li_ya] Thanks for the response and thanks for updating the documentation.
[jira] [Updated] (ARROW-8909) [Java] Out of order writes using setSafe
[ https://issues.apache.org/jira/browse/ARROW-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saurabh updated ARROW-8909:
---------------------------
    Description:
I noticed that calling setSafe on a VarCharVector with indices that are not in increasing order causes lastIndex to be set to the index used in the last call to setSafe.

Is this documented and expected behavior?

Sample code:
{code:java}
import java.util.Collections;
import lombok.extern.slf4j.Slf4j;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;
import org.apache.arrow.vector.util.Text;

@Slf4j
public class ATest {
  public static void main(String[] args) {
    // Single nullable UTF-8 column named "Data".
    Schema schema = new Schema(Collections.singletonList(
        Field.nullable("Data", new ArrowType.Utf8())));
    try (VectorSchemaRoot vroot = VectorSchemaRoot.create(schema, new RootAllocator())) {
      VarCharVector vec = (VarCharVector) vroot.getVector("Data");
      // Fill indices 0..9 in increasing order.
      for (int i = 0; i < 10; i++) {
        vec.setSafe(i, new Text(Integer.toString(i) + "_mtest"));
      }
      // Out-of-order write: overwrite index 7 after index 9 has been set.
      vec.setSafe(7, new Text(Integer.toString(7) + "_new"));
      log.info("Data at index 8 Before {}", vec.getObject(8));
      vroot.setRowCount(10);
      log.info("Data at index 8 After {}", vec.getObject(8));
      log.info(vroot.contentToTSVString());
    }
  }
}
{code}

If I don't set index 7 after the loop, I get all the 0_mtest, 1_mtest, ..., 9_mtest entries.

If I set index 7 after the loop, I see 0_mtest, ..., 5_mtest, 6_mtest, 7_new.
Before the setRowCount call, the data at index 8 is *st8_mtest* and index 9 is *9_mtest*.
After the setRowCount call, the data at index 8 is "" and index 9 is "".
With a replacement text longer than the 4 characters of _new, it eats further into the data at the following indices.

  was:
I noticed that calling setSafe on a VarCharVector with indices that are not in increasing order causes lastIndex to be set to the index used in the last call to setSafe.

Is this documented and expected behavior?

Sample code:
{code:java}
import java.util.Collections;
import lombok.extern.slf4j.Slf4j;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;
import org.apache.arrow.vector.util.Text;

@Slf4j
public class ATest {
  public static void main(String[] args) {
    // Single nullable UTF-8 column named "Data".
    Schema schema = new Schema(Collections.singletonList(
        Field.nullable("Data", new ArrowType.Utf8())));
    try (VectorSchemaRoot vroot = VectorSchemaRoot.create(schema, new RootAllocator())) {
      VarCharVector vec = (VarCharVector) vroot.getVector("Data");
      // Fill indices 0..9 in increasing order.
      for (int i = 0; i < 10; i++) {
        vec.setSafe(i, new Text(Integer.toString(i) + "_mtest"));
      }
      // Out-of-order write: enable one of the two overwrites below.
      // vec.setSafe(0, new Text(Integer.toString(0) + "_new"));
      vec.setSafe(7, new Text(Integer.toString(7) + "_new"));
      vroot.setRowCount(10);
      log.info(vroot.contentToTSVString());
    }
  }
}
{code}

If I don't set index 0 or 7 after the loop, I get all the 0_mtest, 1_mtest, ..., 9_mtest entries.
If I set index 0 after the loop, I only see the 0_new entry; other entries are "".
If I set index 7 after the loop, I see 0_mtest, ..., 5_mtest, 7_new; other entries are "".
[jira] [Updated] (ARROW-8909) Out of order writes using setSafe
[ https://issues.apache.org/jira/browse/ARROW-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saurabh updated ARROW-8909:
---------------------------
    Priority: Major  (was: Minor)
[jira] [Updated] (ARROW-8909) Out of order writes using setSafe
[ https://issues.apache.org/jira/browse/ARROW-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saurabh updated ARROW-8909:
---------------------------
    Priority: Minor  (was: Major)
[jira] [Created] (ARROW-8909) Out of order writes using setSafe
Saurabh created ARROW-8909:
------------------------------

             Summary: Out of order writes using setSafe
                 Key: ARROW-8909
                 URL: https://issues.apache.org/jira/browse/ARROW-8909
             Project: Apache Arrow
          Issue Type: Bug
          Components: Java
            Reporter: Saurabh


I noticed that calling setSafe on a VarCharVector with indices that are not in increasing order causes lastIndex to be set to the index used in the last call to setSafe.

Is this documented and expected behavior?

Sample code:
{code:java}
import java.util.Collections;
import lombok.extern.slf4j.Slf4j;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;
import org.apache.arrow.vector.util.Text;

@Slf4j
public class ATest {
  public static void main(String[] args) {
    // Single nullable UTF-8 column named "Data".
    Schema schema = new Schema(Collections.singletonList(
        Field.nullable("Data", new ArrowType.Utf8())));
    try (VectorSchemaRoot vroot = VectorSchemaRoot.create(schema, new RootAllocator())) {
      VarCharVector vec = (VarCharVector) vroot.getVector("Data");
      // Fill indices 0..9 in increasing order.
      for (int i = 0; i < 10; i++) {
        vec.setSafe(i, new Text(Integer.toString(i) + "_mtest"));
      }
      // Out-of-order write: enable one of the two overwrites below.
      // vec.setSafe(0, new Text(Integer.toString(0) + "_new"));
      vec.setSafe(7, new Text(Integer.toString(7) + "_new"));
      vroot.setRowCount(10);
      log.info(vroot.contentToTSVString());
    }
  }
}
{code}

If I don't set index 0 or 7 after the loop, I get all the 0_mtest, 1_mtest, ..., 9_mtest entries.
If I set index 0 after the loop, I only see the 0_new entry; other entries are "".
If I set index 7 after the loop, I see 0_mtest, ..., 5_mtest, 7_new; other entries are "".