[jira] [Created] (ORC-985) ORC branch 1.7 is producing larger files from java writer
Owen O'Malley created ORC-985: - Summary: ORC branch 1.7 is producing larger files from java writer Key: ORC-985 URL: https://issues.apache.org/jira/browse/ORC-985 Project: ORC Issue Type: Bug Components: Java Affects Versions: 1.7.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Running some tests, I noticed a 5% regression in file sizes with branch 1.7 compared to 1.6. I need to track this down. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ORC-984) Create new writer versions for orc 1.7 and 1.8
Owen O'Malley created ORC-984: - Summary: Create new writer versions for orc 1.7 and 1.8 Key: ORC-984 URL: https://issues.apache.org/jira/browse/ORC-984 Project: ORC Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Currently we can't tell the difference between orc 1.6, 1.7, or 1.8 files. I'd like to introduce a pair of new writer versions that distinguish between them.
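A minimal sketch of the idea in ORC-984, assuming the writer version is stored in the file as a small integer that grows with each release. The enum name, constant names, and id values below are hypothetical and are not taken from the ORC code base; `ORC_1_6` merely stands in for whatever version 1.6 files already record.

```java
/** Hypothetical writer versions; names and ids are illustrative only. */
enum SketchWriterVersion {
  ORC_1_6(1),  // stand-in for the existing version (assumed id)
  ORC_1_7(2),  // first proposed new version (assumed id)
  ORC_1_8(3);  // second proposed new version (assumed id)

  private final int id;

  SketchWriterVersion(int id) {
    this.id = id;
  }

  int getId() {
    return id;
  }

  /** True when this version is the same as or newer than {@code other}. */
  boolean includes(SketchWriterVersion other) {
    return id >= other.id;
  }

  /** Maps an id read from a file back to a version; unknown ids are rejected. */
  static SketchWriterVersion fromId(int id) {
    for (SketchWriterVersion v : values()) {
      if (v.id == id) {
        return v;
      }
    }
    throw new IllegalArgumentException("Unknown writer version id: " + id);
  }
}
```

With distinct ids, a reader could check something like `SketchWriterVersion.fromId(storedId).includes(SketchWriterVersion.ORC_1_7)` before relying on behavior that only 1.7 and later writers provide.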
[GitHub] [orc] pavibhai opened a new pull request #896: ORC-983 Lowered the log level for some messages related to filter processing from INFO to DEBUG
pavibhai opened a new pull request #896: URL: https://github.com/apache/orc/pull/896 ### What changes were proposed in this pull request? A couple of the log statements related to filter processing have been lowered from INFO to DEBUG level. ### Why are the changes needed? Make the logging less verbose. ### How was this patch tested? Regression testing, as there is no functional change in the patch other than the log level. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@orc.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (ORC-983) Lower the log level of some messages related to filter processing
Pavan Lanka created ORC-983: --- Summary: Lower the log level of some messages related to filter processing Key: ORC-983 URL: https://issues.apache.org/jira/browse/ORC-983 Project: ORC Issue Type: Bug Components: Java Affects Versions: 1.7.0, 1.8.0 Reporter: Pavan Lanka Assignee: Pavan Lanka A couple of log statements in filter processing are at INFO level; these should be changed to DEBUG level. They report the status of `orc.filter.use.selected` and the determination of filter columns.
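The effect of the change requested in ORC-983 can be sketched with `java.util.logging`, used here only as a stand-in for the project's logging API (an assumption; `FINE` plays the role of DEBUG). The logger name and message texts are hypothetical. Once a message is logged at the debug-like level, a logger left at INFO no longer emits it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class FilterLogLevelSketch {
  /** Collects published messages so the effect of the logger level is observable. */
  static final class CapturingHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      if (isLoggable(record)) {
        messages.add(record.getMessage());
      }
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }
  }

  /** Logs one debug-level and one info-level message; returns what was emitted. */
  static List<String> run(Level loggerLevel) {
    // Hypothetical logger name; one logger per level keeps the runs independent.
    Logger log = Logger.getLogger("sketch.filter." + loggerLevel.getName());
    log.setUseParentHandlers(false);
    log.setLevel(loggerLevel);
    CapturingHandler handler = new CapturingHandler();
    handler.setLevel(Level.ALL);
    log.addHandler(handler);

    // Previously INFO in this sketch, now lowered to the debug-like level.
    log.fine("filter columns determined");
    // An unrelated message that stays at INFO.
    log.info("reader opened");

    log.removeHandler(handler);
    return handler.messages;
  }
}
```

Calling `FilterLogLevelSketch.run(Level.INFO)` returns only the INFO message, while `run(Level.FINE)` returns both, which is the reduction in verbosity the issue asks for.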
Re: [UPDATE] Apache ORC 1.7.0 Preparation
Thank you, William.

BTW, I realized that I made a typo in the title. This thread is for Apache ORC 1.7.0. Sorry for the confusion.

Here are additional updates.
- Thanks to Yiqun Zhang, with ORC-978 the Apache ORC 1.7.0 snapshot passed the Apache Iceberg integration test.
- Thanks to Pavan Lanka, ORC-980 (`Filter processing respects the case-sensitivity flag`) was fixed.

The only blocker-level JIRA issue is a C++ issue.
ORC-968: Column names used to build SearchArgument should be full path names

Dongjoon.

On 2021/09/03 00:19:49, William Hyun wrote:
> Thank you for the status update.
>
> On 2021/09/01 04:46:07, Dongjoon Hyun wrote:
> > Hi, All.
> >
> > Here is 1.7.0 preparation status as of today.
> >
> > # On-going blocker issues.
> > - ORC-968: Column names used to build SearchArgument should be full path names
> > - ORC-978: Fix NPE in TestFlinkOrcReaderWriter
> >
> > # Updated umbrella JIRA issues
> > - ORC-744 (LazyIO of non-filter columns) is resolved.
> > - ORC-731 (Improve `Java Tools`) is resolved.
> > - ORC-798 (Add `@since` tag to public interfaces and classes) landed 3 patches.
> > - ORC-979 (C++ API QA) landed 4 patches but has ORC-968 as a blocker.
> >
> > # Other resolved blocker issues
> > - ORC-965 (Fix ZSTD 'Overflow detected' failure) is fixed at both 1.7.0/1.6.11.
> >
> > Best,
> > Dongjoon.
[GitHub] [orc] guiyanakuang commented on pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
guiyanakuang commented on pull request #895: URL: https://github.com/apache/orc/pull/895#issuecomment-912636483 Thank you very much @dongjoon-hyun, thanks for reviewing and fixing the format.
[GitHub] [orc] dongjoon-hyun merged pull request #893: ORC-980: Filter processing respects the case-sensitivity flag
dongjoon-hyun merged pull request #893: URL: https://github.com/apache/orc/pull/893
[GitHub] [orc] dongjoon-hyun commented on a change in pull request #893: ORC-980: Filter processing respects the case-sensitivity flag
dongjoon-hyun commented on a change in pull request #893: URL: https://github.com/apache/orc/pull/893#discussion_r701992522 ## File path: java/core/src/test/org/apache/orc/TestRowFilteringIOSkip.java ## @@ -570,6 +570,29 @@ public void schemaEvolutionLong2StringColumn() throws IOException { assertEquals(1, rowCount); } + @Test + public void readCaseInsensitive() throws IOException { Review comment: Thank you so much!
[GitHub] [orc] dongjoon-hyun merged pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
dongjoon-hyun merged pull request #895: URL: https://github.com/apache/orc/pull/895
[GitHub] [orc] dongjoon-hyun commented on a change in pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
dongjoon-hyun commented on a change in pull request #895: URL: https://github.com/apache/orc/pull/895#discussion_r701985809 ## File path: java/checkstyle.xml ## @@ -0,0 +1,57 @@ + + +
[GitHub] [orc] pavibhai commented on a change in pull request #893: ORC-980: Filter processing respects the case-sensitivity flag
pavibhai commented on a change in pull request #893: URL: https://github.com/apache/orc/pull/893#discussion_r701973669 ## File path: java/core/src/test/org/apache/orc/TestRowFilteringIOSkip.java ## @@ -570,6 +570,29 @@ public void schemaEvolutionLong2StringColumn() throws IOException { assertEquals(1, rowCount); } + @Test + public void readCaseInsensitive() throws IOException { Review comment: The default value is true for case-sensitivity so all the other tests are case-sensitive tests. I added an explicit failure test with case-sensitivity when the name is not found.
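The behavior discussed in this review (name resolution that honors a case-sensitivity flag, with an explicit failure when the name is not found) can be sketched as follows. This is a simplified illustration and not the code from the pull request; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class CaseAwareLookupSketch {
  /**
   * Returns the index of the first field matching {@code name}, or -1.
   * When {@code caseAware} is true the match must be exact; otherwise
   * letter case is ignored.
   */
  static int findColumn(List<String> fieldNames, String name, boolean caseAware) {
    for (int i = 0; i < fieldNames.size(); i++) {
      String candidate = fieldNames.get(i);
      boolean match = caseAware
          ? candidate.equals(name)
          : candidate.equalsIgnoreCase(name);
      if (match) {
        return i;
      }
    }
    return -1;
  }

  /** Like findColumn, but fails loudly when the filter column is missing. */
  static int requireColumn(List<String> fieldNames, String name, boolean caseAware) {
    int index = findColumn(fieldNames, name, caseAware);
    if (index < 0) {
      throw new IllegalArgumentException("Filter column not found: " + name);
    }
    return index;
  }

  public static void main(String[] args) {
    List<String> schema = Arrays.asList("f1", "F2");
    System.out.println(findColumn(schema, "f2", false)); // case-insensitive lookup
    System.out.println(findColumn(schema, "f2", true));  // case-sensitive lookup
  }
}
```

In this sketch the case-sensitive default corresponds to `caseAware = true`, so a lookup of `f2` against a schema that only has `F2` fails unless the flag is turned off.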
[GitHub] [orc] pavibhai commented on a change in pull request #893: ORC-980: Filter processing respects the case-sensitivity flag
pavibhai commented on a change in pull request #893: URL: https://github.com/apache/orc/pull/893#discussion_r701964316 ## File path: java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java ## @@ -283,6 +283,7 @@ protected RecordReaderImpl(ReaderImpl fileReader, Consumer filterCallBack = null; BatchFilter filter = FilterFactory.createBatchFilter(options, evolution.getReaderBaseSchema(), + evolution.isSchemaEvolutionCaseAware, Review comment: Good point, changed.
[GitHub] [orc] guiyanakuang commented on a change in pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
guiyanakuang commented on a change in pull request #895: URL: https://github.com/apache/orc/pull/895#discussion_r701890984 ## File path: java/checkstyle.xml ## @@ -0,0 +1,57 @@ + + +https://checkstyle.org/dtds/configuration_1_2.dtd;> + + + Review comment: Fix in ec7c471.
[GitHub] [orc] guiyanakuang commented on a change in pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
guiyanakuang commented on a change in pull request #895: URL: https://github.com/apache/orc/pull/895#discussion_r701823066 ## File path: java/checkstyle.xml ## @@ -0,0 +1,57 @@ + + +https://checkstyle.org/dtds/configuration_1_2.dtd;> + + + Review comment: My IDE's config for xml indentation defaults to 4 spaces. I'll fix this later.
[GitHub] [orc] dongjoon-hyun commented on a change in pull request #893: ORC-980: Filter processing respects the case-sensitivity flag
dongjoon-hyun commented on a change in pull request #893: URL: https://github.com/apache/orc/pull/893#discussion_r701812550 ## File path: java/core/src/test/org/apache/orc/TestRowFilteringIOSkip.java ## @@ -570,6 +570,29 @@ public void schemaEvolutionLong2StringColumn() throws IOException { assertEquals(1, rowCount); } + @Test + public void readCaseInsensitive() throws IOException { Review comment: Do we have case-sensitive test coverage?
[GitHub] [orc] dongjoon-hyun commented on a change in pull request #893: ORC-980: Filter processing respects the case-sensitivity flag
dongjoon-hyun commented on a change in pull request #893: URL: https://github.com/apache/orc/pull/893#discussion_r701811596 ## File path: java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java ## @@ -283,6 +283,7 @@ protected RecordReaderImpl(ReaderImpl fileReader, Consumer filterCallBack = null; BatchFilter filter = FilterFactory.createBatchFilter(options, evolution.getReaderBaseSchema(), + evolution.isSchemaEvolutionCaseAware, Review comment: For consistency, could you use the getter function which is added by this PR? ``` public boolean isSchemaEvolutionCaseAware() ```
[GitHub] [orc] dongjoon-hyun commented on a change in pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
dongjoon-hyun commented on a change in pull request #895: URL: https://github.com/apache/orc/pull/895#discussion_r701808648 ## File path: java/checkstyle.xml ## @@ -0,0 +1,57 @@ + + +https://checkstyle.org/dtds/configuration_1_2.dtd;> + + + Review comment: Shall we use two-space indentation?
[GitHub] [orc] guiyanakuang opened a new pull request #895: ORC-982: Extract checkstyle to a single file, help newcomers check code style
guiyanakuang opened a new pull request #895: URL: https://github.com/apache/orc/pull/895 ### What changes were proposed in this pull request? Extract checkstyle to a single file. Added tips to coding.md. ### Why are the changes needed? The [CheckStyle-IDEA](https://plugins.jetbrains.com/plugin/1065-checkstyle-idea) plugin can load this checkstyle.xml easily. This way you get checkstyle errors/warnings while you are coding. ![image](https://user-images.githubusercontent.com/4069905/131971923-a08b9520-2a9d--844f-a5e3e1396e57.png) ### How was this patch tested? Pass the CIs.
[jira] [Created] (ORC-982) Extract checkstyle to a single file, help newcomers check code style
Yiqun Zhang created ORC-982: --- Summary: Extract checkstyle to a single file, help newcomers check code style Key: ORC-982 URL: https://issues.apache.org/jira/browse/ORC-982 Project: ORC Issue Type: Improvement Components: Java Affects Versions: 1.8.0 Reporter: Yiqun Zhang Fix For: 1.8.0 Attachments: screenshot-1.png Extract checkstyle to a single file to help newcomers check code style. The [CheckStyle-IDEA|https://plugins.jetbrains.com/plugin/1065-checkstyle-idea] plugin can load this checkstyle.xml easily. This way you get checkstyle errors/warnings while you are coding.