This is an automated email from the ASF dual-hosted git repository. kgyrtkirk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git
The following commit(s) were added to refs/heads/master by this push:
     new 0099b14aa6a HIVE-25733: Add check-spelling/check-spelling (#2809) (Josh Soref reviewed by Zoltan Haindrich)
0099b14aa6a is described below

commit 0099b14aa6a50d4470b057e93a95a7391b74add7
Author: Josh Soref <2119212+jso...@users.noreply.github.com>
AuthorDate: Mon Jun 13 11:05:41 2022 -0400

    HIVE-25733: Add check-spelling/check-spelling (#2809) (Josh Soref reviewed by Zoltan Haindrich)
---
 .github/actions/spelling/README.md    |  17 ++
 .github/actions/spelling/advice.md    |  25 ++
 .github/actions/spelling/allow.txt    |   0
 .github/actions/spelling/excludes.txt |  57 +++++
 .github/actions/spelling/expect.txt   | 449 ++++++++++++++++++++++++++++++++++
 .github/actions/spelling/only.txt     |   1 +
 .github/actions/spelling/patterns.txt |  38 +++
 .github/actions/spelling/reject.txt   |   7 +
 .github/workflows/spelling.yml        |  69 ++++++
 9 files changed, 663 insertions(+)

diff --git a/.github/actions/spelling/README.md b/.github/actions/spelling/README.md
new file mode 100644
index 00000000000..749294b33fb
--- /dev/null
+++ b/.github/actions/spelling/README.md
@@ -0,0 +1,17 @@
+# check-spelling/check-spelling configuration
+
+File | Purpose | Format | Info
+-|-|-|-
+<!--
+[dictionary.txt](dictionary.txt) | Replacement dictionary (creating this file will override the default dictionary) | one word per line | [dictionary](https://github.com/check-spelling/check-spelling/wiki/Configuration#dictionary)
+-->
+[allow.txt](allow.txt) | Add words to the dictionary | one word per line (only letters and `'`s allowed) | [allow](https://github.com/check-spelling/check-spelling/wiki/Configuration#allow)
+[reject.txt](reject.txt) | Remove words from the dictionary (after allow) | grep pattern matching whole dictionary words | [reject](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-reject)
+[excludes.txt](excludes.txt) | Files to ignore entirely | perl regular expression | [excludes](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-excludes)
+[only.txt](only.txt) | Only check matching files (applied after excludes) | perl regular expression | [only](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-only)
+[patterns.txt](patterns.txt) | Patterns to ignore from checked lines | perl regular expression (order matters, first match wins) | [patterns](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-patterns)
+[expect.txt](expect.txt) | Expected words that aren't in the dictionary | one word per line (sorted, alphabetically) | [expect](https://github.com/check-spelling/check-spelling/wiki/Configuration#expect)
+[advice.md](advice.md) | Supplement for GitHub comment when unrecognized words are found | GitHub Markdown | [advice](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-advice)
+
+Note: you can replace any of these files with a directory by the same name (minus the suffix)
+and then include multiple files inside that directory (with that suffix) to merge multiple files together.
diff --git a/.github/actions/spelling/advice.md b/.github/actions/spelling/advice.md
new file mode 100644
index 00000000000..c83423a8ef6
--- /dev/null
+++ b/.github/actions/spelling/advice.md
@@ -0,0 +1,25 @@
+<!-- See https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-advice --> <!-- markdownlint-disable MD033 MD041 -->
+<details><summary>If the flagged items do not appear to be text</summary>
+
+If items relate to a ...
+* well-formed pattern.
+
+    If you can write a [pattern](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples:-patterns) that would match it,
+    try adding it to the `patterns.txt` file.
+
+    Patterns are Perl 5 Regular Expressions - you can [test](
+https://www.regexplanet.com/advanced/perl/) yours before committing to verify it will match your lines.
+
+    Note that patterns can't match multiline strings.
+
+* binary file.
+
+    Please add a file path to the `excludes.txt` file matching the containing file.
+
+    File paths are Perl 5 Regular Expressions - you can [test](
+https://www.regexplanet.com/advanced/perl/) yours before committing to verify it will match your files.
+
+    `^` refers to the file's path from the root of the repository, so `^README\.md$` would exclude [README.md](
+../tree/HEAD/README.md) (on whichever branch you're using).
+
+</details>
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
new file mode 100644
index 00000000000..e69de29bb2d
diff --git a/.github/actions/spelling/excludes.txt b/.github/actions/spelling/excludes.txt
new file mode 100644
index 00000000000..f15a0e0c9ca
--- /dev/null
+++ b/.github/actions/spelling/excludes.txt
@@ -0,0 +1,57 @@
+# See https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples:-excludes
+(?:^|/)(?i)COPYRIGHT
+(?:^|/)(?i)LICEN[CS]E
+(?:^|/)package(?:-lock|)\.json$
+(?:^|/)vendor/
+ignore$
+LICENSE
+\.avi$
+\.avro$
+\.bz2$
+\.deflate$
+\.eot$
+\.gif$
+\.gz$
+\.ico$
+\.jar$
+\.jceks$
+\.jks$
+\.jpe?g$
+\.jpeg$
+\.jpg$
+\.keep$
+\.lock$
+\.log$
+\.map$
+\.min\.
+\.min\..
+\.mod$
+\.mp[34]$
+\.orc$
+\.out$
+\.out\.
+\.parq$
+\.parquet$
+\.png$
+\.q$
+\.rc$
+\.rcfile$
+\.seq$
+\.svg$
+\.ttf$
+\.wav$
+\.woff$
+^data/files
+^data/scripts/q_test_cleanup\.sql$
+^docs/people\.md$
+^errata\.txt$
+^hplsql/src/test/queries/local/to_timestamp\.sql$
+^iceberg-handler/checkstyle/checkstyle-suppressions\.xml$
+^itests/hive-unit/src/test/resources/simple-saml-idp-metadata-template\.xml$
+^llap-common/src/gen/protobuf/gen-java/org/apache/hadoop/hive/llap/daemon/rpc/LlapDaemonProtocolProtos\.java$
+^llap-common/src/gen/protobuf/gen-java/org/apache/hadoop/hive/llap/plugin/rpc/LlapPluginProtocolProtos\.java$
+^serde/src/test/resources/json/single_pixel\.json$
+^spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient\.java$
+^testutils/ptest2/src/test/resources/
+^\.github/
+^\Qql/src/test/resources/hsmm/hsmm_cfg_01.yaml\E$
diff --git a/.github/actions/spelling/expect.txt b/.github/actions/spelling/expect.txt
new file mode 100644
index 00000000000..39b9d7fc583
--- /dev/null
+++ b/.github/actions/spelling/expect.txt
@@ -0,0 +1,449 @@
+AAAAABJRU
+aarry
+abcd
+abcde
+abcdef
+abcdefg
+abcdefgh
+abcdefghij
+abcdefghijklmnopqrstuvwxyz
+abyte
+adouble
+aeden
+afields
+AFq
+agrw
+aloi
+alphadigits
+amap
+ANANCIENTBLUEBOX
+ANull
+anullint
+anullstring
+aoig
+arecord
+args
+arraycopy
+asd
+ASF
+ashort
+astring
+ati
+attr
+aunion
+Autogenerated
+avro
+avrotest
+avsc
+bais
+baoi
+baos
+BArray
+basedir
+bdoi
+bdw
+bigint
+binarysortable
+bitfield
+bitset
+blahblahblah
+bnoi
+boi
+BTEQ
+buf
+byoi
+bytearrayinput
+bytebuffer
+bytecode
+byteinfo
+BYTEINT
+byw
+cae
+cfields
+charset
+classname
+clazz
+Cloneable
+cmp
+CODEPOINT
+coi
+columnset
+complexpb
+concat
+concating
+config
+cpp
+Cpy
+CReadable
+crlf
+csv
+ctl
+ctor
+ctype
+cwiki
+CYBERMEN
+DALEKS
+daos
+datetime
+ddl
+decs
+Deque
+deser
+deserialization
+deserialize
+deserialized
+deserializer
+deserializing
+dest
+dfr
+dfs
+dfw
+Dio
+doi
+DOTALL
+DOUBLEVALUE
+dti
+dtoi
+dtype
+DZone
+ecapseman
+ecast
+Eccleston
+eid
+ele
+elif
+endian
+endif
+entryinput
+entryoutput
+enum
+enumlist
+enumness
+enumset
+eoi
+EQVR
+Escapables
+ESCAPECHAR
+ETX
+etype
+facebook
+fastload
+ffec
+fff
+fieldids
+Fieldname
+firstfield
+foi
+foreach
+fourbytes
+frac
+ftype
+gallifrey
+garw
+gdr
+gdw
+gmt
+hackersdelight
+hadoop
+hamcrest
+hashcode
+Hashtable
+hbase
+hcoi
+hconf
+hcw
+HDFS
+hdoi
+hdw
+HIVEFETCHOUTPUTSERDE
+hiw
+href
+http
+hvc
+hvoi
+iae
+IBegin
+identd
+idx
+ifndef
+ILength
+impl
+inited
+inlining
+inp
+INPATH
+inputformat
+inputmpkeyoi
+inputmpoi
+inputmpvalueoi
+inputstream
+instanceof
+interop
+Ints
+INTVALUE
+ioe
+ioi
+iosfwd
+iprot
+IScheme
+isdigit
+isinstance
+islist
+isset
+itest
+ith
+JAGRAFESS
+javabean
+Javadoc
+javax
+jdbc
+jdm
+Jggg
+Joda
+json
+junit
+kitchsink
+ktype
+kyro
+lasti
+lazybinary
+lazydio
+lazympkeyoi
+lazympoi
+lazympvalueoi
+lazyobj
+LAZYSIMPLE
+LBRACE
+LBRACKET
+len
+LINTSTRING
+loi
+LONGVALUE
+lstring
+ltype
+mapkey
+mapsize
+MAXVALUE
+mctesty
+Mega
+megastruct
+memoizes
+memoizing
+metadata
+metastore
+millis
+MINVALUE
+mkdirs
+mkoi
+mstring
+MSTRINGSTRING
+mti
+mtype
+mvoi
+mydouble
+myint
+mylong
+mysql
+MYSTRING
+namespace
+nano
+narray
+NByte
+Nestedin
+nfe
+nio
+noexcept
+nokey
+nondigits
+nullability
+nullableenum
+nullableint
+nullptr
+nullsafe
+nullstring
+NUu
+objectinspector
+objs
+ois
+olist
+omap
+opencsv
+oprot
+ostream
+outputmp
+outputstream
+params
+php
+plugin
+png
+println
+prj
+prot
+protobuf
+protoc
+pti
+ptr
+ptype
+QLFTWHXe
+QUOTECHAR
+RAWSTRING
+rawtypes
+RBRACE
+RBRACKET
+rdata
+readcolumn
+recv
+reencode
+reencoding
+regex
+RESULTSET
+returncode
+rhs
+rmi
+rossa
+RRID
+sde
+SEPARATORCHAR
+sequencefile
+serde
+serdeing
+SERDEPROPERTIES
+serializer
+shiftbits
+SInt
+SINTSTRING
+SJAAAADUl
+slist
+SLITHEEN
+smallint
+soi
+spoi
+sql
+src
+ssoi
+stackoverflow
+sti
+stringify
+stringlist
+stringset
+STRINGVALUE
+strtod
+structlist
+structs
+structset
+STX
+stype
+subrecord
+superfield
+sys
+TApplication
+TBase
+TBinary
+tbl
+tblproperties
+TBool
+TByte
+TColumn
+TCompact
+TConfiguration
+TConstant
+TCTL
+TDouble
+TEnum
+teradata
+testcompare
+testfield
+testget
+testi
+testthrift
+TException
+TEXTFILE
+TField
+TFrozen
+threebytes
+Throwable
+timelabel
+timestamplocal
+TIMESTAMPLOCALTZ
+timestamptz
+timezone
+TInput
+tinyint
+TIO
+tlist
+tmap
+TMemory
+TMessage
+tmp
+todo
+tokenized
+tokenizer
+tokenizing
+TOutput
+tprotocol
+TRecursive
+TReflection
+TRegex
+tsd
+TSet
+tsoi
+TSOP
+TSPEC
+TString
+tstruct
+tstz
+tsw
+TTo
+ttransport
+TTuple
+ttype
+TUnion
+typedef
+typeinfo
+typestring
+udf
+uint
+ULong
+uncompressing
+unescape
+unicode
+Uniob
+UNIONFIELD
+UNIONMSTRINGSTRING
+uniontype
+uoi
+uri
+urie
+url
+usoi
+utf
+Utils
+utype
+uuid
+vals
+varbyte
+varchar
+varname
+vchar
+versioned
+vidt
+vints
+viter
+viyt
+vlong
+voi
+vtype
+wiki
+workaround
+writables
+www
+XDh
+xfer
+xml
+xmlns
+xsi
+yadda
+yes'okay
+yyyy
+YYYYMMDD
+zid
diff --git a/.github/actions/spelling/only.txt b/.github/actions/spelling/only.txt
new file mode 100644
index 00000000000..a1dd009c742
--- /dev/null
+++ b/.github/actions/spelling/only.txt
@@ -0,0 +1 @@
+^serde/
diff --git a/.github/actions/spelling/patterns.txt b/.github/actions/spelling/patterns.txt
new file mode 100644
index 00000000000..0600af057b7
--- /dev/null
+++ b/.github/actions/spelling/patterns.txt
@@ -0,0 +1,38 @@
+# See https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples:-patterns
+
+# stackexchange -- https://stackexchange.com/feeds/sites
+\b(?:askubuntu|serverfault|stack(?:exchange|overflow)|superuser).com/questions/\d+/[a-z-]+
+# w3
+\bw3\.org/[-0-9a-zA-Z/#.]+
+# mvnrepository.com
+\bmvnrepository\.com/[-0-9a-z./]+
+# URL escaped characters
+\%[0-9A-F]{2}
+# sha-1
+"[0-9a-f]{40}"
+# hex digits including css/html color classes:
+(?:[\\0][xX]|\\u|[uU]\+|#|\%23)[0-9a-fA-FgGrR]{2,}[uU]?[lL]{0,2}\b
+# uuid:
+[{"'][0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}['"}]
+# curl arguments
+\b(?:)curl(?:\s+-[a-zA-Z]+)+
+# tar arguments
+\b(?:)tar(?:\s+-[a-zA-Z]+|\s[a-z]+)+
+# decode
+\.decode\("[0-9a-zA-Z]+"
+#
+"[0-9a-f]{80,}"
+# pom.xml
+<(additionalClasspathElement|arg|(?:artifact|group)Id|id|installDir|jvmArgs|mainClass|shadedClassifierName|template(?:Base|Source|Output)Dir|version)>[^<]*</\1>
+\b(?:(?:build|template(?:Base|Source|Output))Dir|(?:to|)dir|location)="[^"]*"
+xsi:schemaLocation="[^"]*"
+# java
+\bimport (?:com|org)\.[a-z0-9.]*\b
+# ql/src/gen/protobuf/gen-java/org/apache/hadoop/hive/ql/hooks/proto/HiveHookEvents.java
+016Hiv"
+# serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
+\bNULLable\b
+# serde/src/test/org/apache/hadoop/hive/serde2/lazy/fast/TestLazySimpleDeserializeRead.java
+"This\\{1,2}n..*"
+# ignore long runs of a single character:
+\b([A-Za-z])\g{-1}{3,}\b
diff --git a/.github/actions/spelling/reject.txt b/.github/actions/spelling/reject.txt
new file mode 100644
index 00000000000..a5ba6f6390e
--- /dev/null
+++ b/.github/actions/spelling/reject.txt
@@ -0,0 +1,7 @@
+^attache$
+benefitting
+occurence
+Sorce
+^[Ss]pae
+^untill
+^wether
diff --git a/.github/workflows/spelling.yml b/.github/workflows/spelling.yml
new file mode 100644
index 00000000000..16f6f75333e
--- /dev/null
+++ b/.github/workflows/spelling.yml
@@ -0,0 +1,69 @@
+name: Spell checking
+on:
+  push:
+    branches: ["**"]
+    tags-ignore: ["**"]
+  pull_request_target:
+
+jobs:
+  spelling:
+    name: Spell checking
+    permissions:
+      contents: read
+      pull-requests: read
+    outputs:
+      internal_state_directory: ${{ steps.spelling.outputs.internal_state_directory }}
+    runs-on: ubuntu-latest
+    if: "contains(github.event_name, 'pull_request') || github.event_name == 'push'"
+    concurrency:
+      group: spelling-${{ github.event.pull_request.number || github.ref }}
+      # note: If you use only_check_changed_files, you do not want cancel-in-progress
+      cancel-in-progress: true
+    steps:
+    - name: checkout-merge
+      if: "contains(github.event_name, 'pull_request')"
+      uses: actions/checkout@v2
+      with:
+        ref: refs/pull/${{github.event.pull_request.number}}/merge
+    - name: checkout
+      if: github.event_name == 'push'
+      uses: actions/checkout@v2
+    - name: check-spelling
+      id: spelling
+      uses: check-spelling/check-spelling@v0.0.20-alpha3
+      with:
+        suppress_push_for_open_pull_request: 1
+        post_comment: 0
+    - name: store-comment
+      if: failure()
+      uses: actions/upload-artifact@v2
+      with:
+        retention-days: 1
+        name: "check-spelling-comment-${{ github.run_id }}"
+        path: |
+          ${{ steps.spelling.outputs.internal_state_directory }}
+
+  comment:
+    name: Comment
+    runs-on: ubuntu-latest
+    needs: spelling
+    permissions:
+      contents: write
+      pull-requests: write
+    if: always() && needs.spelling.result == 'failure' && needs.spelling.outputs.internal_state_directory
+    steps:
+    - name: checkout
+      uses: actions/checkout@v2
+    - name: set up
+      run: |
+        mkdir /tmp/data
+    - name: retrieve-comment
+      uses: actions/download-artifact@v2
+      with:
+        name: "check-spelling-comment-${{ github.run_id }}"
+        path: /tmp/data
+    - name: comment
+      uses: check-spelling/check-spelling@v0.0.20-alpha3
+      with:
+        custom_task: comment
+        internal_state_directory: /tmp/data
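
Illustration (not part of the commit): the README in this change describes excludes.txt as "Files to ignore entirely" and only.txt as "Only check matching files (applied after excludes)". The Python sketch below shows one way to read that selection rule. It is not check-spelling's implementation. The function name is_checked and the sample paths in the usage note are hypothetical, Python's re module stands in for Perl regular expressions, and only a few entries whose syntax both engines share are copied from the two files.

```python
import re

# A few entries copied from excludes.txt and only.txt in this commit.
# Entries using Perl-only syntax (for example \Q...\E) are left out on purpose.
EXCLUDES = [
    r"\.png$",
    r"\.jar$",
    r"^\.github/",
    r"^data/files",
    r"^serde/src/test/resources/json/single_pixel\.json$",
]
ONLY = [r"^serde/"]


def is_checked(path, excludes=EXCLUDES, only=ONLY):
    """Return True if `path` would be spell-checked under this sketch.

    Assumption: a pattern applies when it matches anywhere in the
    repository-relative path, so re.search is used rather than re.match.
    """
    # Step 1: any excludes match drops the file entirely.
    if any(re.search(pattern, path) for pattern in excludes):
        return False
    # Step 2: if only-patterns exist, the file must match at least one.
    if only and not any(re.search(pattern, path) for pattern in only):
        return False
    return True
```

With these lists, a hypothetical path such as serde/src/java/Example.java passes both steps, while ql/src/java/Example.java is dropped by the only.txt rule and serde/src/test/resources/json/single_pixel.json is dropped by excludes.txt before only.txt is consulted.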