Hi,

On 2021-04-27, 22:21 +0200, Nicolas Goaziou <[email protected]> wrote:
>> + When using org-table-import interactively if we failed to guess
>> separator then we will be left with a user-error message and an
>> 'unconverted table'. We can make use of 'temp-buffer' to import our
>> file after successfully conversion.
>
> I'm not sure to understand what you mean.
Note: I would advise you to apply patch no. 2 before trying out the
following example.
1. Download the attached CSV file. We can call this example.csv
2. Go to *scratch* buffer.
3. Use 'M-x org-table-import' to import example.csv as org-table.
You will see that even though org-table-guess-separator failed to guess the
separator, the unconverted region is still left in our buffer.
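To illustrate the 'temp-buffer' idea quoted above, here is a minimal
sketch. The name my/insert-converted-or-nothing and its CONVERT-FN
argument are made up for this example and are not part of Org; in
org-table-import the conversion step would presumably be a call to
org-table-convert-region. The only point is that when the conversion
signals an error, nothing reaches the user's buffer.

```elisp
;; Hypothetical helper, for illustration only (not part of Org).
(defun my/insert-converted-or-nothing (file convert-fn)
  "Insert the contents of FILE at point, but only after converting them.
CONVERT-FN is called with no arguments in a temporary buffer that
holds the contents of FILE.  If it signals an error, the error
propagates and the current buffer is left untouched."
  (let ((converted
         (with-temp-buffer
           (insert-file-contents file)
           ;; Any error here exits before we touch the user's buffer.
           (funcall convert-fn)
           (buffer-string))))
    (insert converted)))
```

With something like (lambda () (org-table-convert-region (point-min)
(point-max))) as CONVERT-FN, a "Failed to guess separator" user-error
would then leave *scratch* unchanged instead of leaving raw CSV behind.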
>> + Conversion part of org-table-convert-region make a distinction between
>> '(4) (comma separator) and rest of the separator we should either string
>> version of comma as AND condition or rewrite to simplify it.
>
> Ditto. But it can be the object of another patch. Let's concentrate on
> `org-table-guess-separator' first.
>
>> I am willing to do these possible changes but currently waiting for your
>> review for org-table-guess-separator as there can be more serious bugs
>> lurking around on my code which I am considering base for these
>> changes.
>
> You should definitely write tests for this function. Here's a start:
>
> (ert-deftest test-org-table/guess-separator ()
>   "Test `test-org-table/guess-separator'."
>   ;; Test space separator.
>   (should
>    (equal " "
>           (org-test-with-temp-text "a b\nc d"
>             (org-table-guess-separator (point-min) (point-max)))))
>   (should
>    (equal " "
>           (org-test-with-temp-text "a b\nc d"
>             (org-table-guess-separator (point-min) (point-max)))))
>   ;; Test "inverted" region.
>   (should
>    (equal " "
>           (org-test-with-temp-text "a b\nc d"
>             (org-table-guess-separator (point-max) (point-min)))))
>   ;; Do not error on empty region.
>   (should-not
>    (org-test-with-temp-text ""
>      (org-table-guess-separator (point-max) (point-min))))
>   (should-not
>    (org-test-with-temp-text " \n"
>      (org-table-guess-separator (point-max) (point-min)))))
>
I will certainly do more testing.
I would also like to simplify the condition for guessing SPACE as the
separator, because of the following cases:
+ field1 'this is field2' 'this is field3' :: In this case we still have
SPACE inside the quotes (' in this case).
+ Since SPACE is our last valid separator, I think searching for a line
which doesn't contain a space is more than enough.
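The simplified check could look roughly like this.
my/guess-space-separator is a hypothetical helper that works on a
string; it is not the code from the patch. It only shows the "no line
without a space" test, and it deliberately ignores the empty-region
case, which org-table-guess-separator handles separately.

```elisp
;; Hypothetical helper, for illustration only (not part of Org).
(defun my/guess-space-separator (text)
  "Return \" \" when every non-empty line of TEXT contains a space.
Return nil as soon as one non-empty line has no space at all.
Spaces inside quotes count like any other space."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Look for a whole line made only of non-space characters.
    (unless (re-search-forward "^[^\n ]+$" nil t)
      " ")))
```

For the quoted-field line above this returns " ", because the line
still contains spaces, while a text with a line such as "cd" gives nil.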
Required patches:
From 6b112927de73c43edfd08254217808ebff42772a Mon Sep 17 00:00:00 2001
From: Utkarsh Singh <[email protected]>
Date: Wed, 28 Apr 2021 10:26:46 +0530
Subject: [PATCH 1/3] org-table.el (org-table-import): add yes-and-no prompt

Add a yes and no prompt for files which don't have .txt, .tsv OR .csv as
file extensions.
---
 lisp/org/org-table.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index 0e93fb271f..e0b2be6892 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -938,7 +938,8 @@ org-table-import
 - regexp  When a regular expression, use it to match the separator."
   (interactive "f\nP")
   (when (and (called-interactively-p 'any)
-             (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file)))
+             (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file))
+             (not (yes-or-no-p "File does not have .txt, .tsv or .csv as extension. Do you still want to continue? ")))
     (user-error "Cannot import such file"))
   (unless (bolp) (insert "\n"))
   (let ((beg (point))
-- 
2.31.1

From 9bb017cfc8284075e04faf5496ed560ba48d5bbc Mon Sep 17 00:00:00 2001
From: Utkarsh Singh <[email protected]>
Date: Wed, 28 Apr 2021 10:42:32 +0530
Subject: [PATCH 2/3] org-table.el (org-table-convert-region): move out
 separator-guessing

1. Move separator guessing code to org-table-guess-separator (new
function).
2. Add semicolon, colon and SPACE to the list of know separator
(separator which we can guess).
---
 lisp/org/org-table.el | 49 +++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index e0b2be6892..295f7a9b90 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -846,6 +846,39 @@ org-table-create
       (goto-char pos))
     (org-table-align)))
 
+(defun org-table-guess-separator (beg0 end0)
+  "Guess separator for region BEG0 to END0.
+
+List of preferred separator (in order of preference):
+comma, TAB, semicolon, colon or SPACE.
+
+Search for a line which doesn't contain a separator if found
+search again using next preferred separator or else return
+separator as string."
+  (let* ((beg (save-excursion
+                (goto-char (min beg0 end0))
+                (skip-chars-forward " \t\n")
+                (if (eobp) (point) (line-beginning-position))))
+         (end (save-excursion
+                (goto-char (max beg0 end0))
+                (skip-chars-backward " \t\n" beg)
+                (if (= beg (point)) (point) (line-end-position))))
+         (sep-regexp
+          (list (list "," (rx bol (1+ (not (or ?\n ?,))) eol))
+                (list "\t" (rx bol (1+ (not (or ?\n ?\t))) eol))
+                (list ";" (rx bol (1+ (not (or ?\n ?\;))) eol))
+                (list ":" (rx bol (1+ (not (or ?\n ?:))) eol))
+                (list " " (rx bol (1+ (not (or ?\n ?\s))) eol)))))
+    (unless (= beg end)
+      (save-excursion
+        (goto-char beg)
+        (catch :found
+          (pcase-dolist (`(,sep ,regexp) sep-regexp)
+            (save-excursion
+              (unless (re-search-forward regexp end t)
+                (throw :found sep))))
+          nil)))))
+
 ;;;###autoload
 (defun org-table-convert-region (beg0 end0 &optional separator)
   "Convert region to a table.
@@ -862,10 +895,7 @@ org-table-convert-region
 integer  When a number, use that many spaces, or a TAB, as field separator
 regexp   When a regular expression, use it to match the separator
 nil      When nil, the command tries to be smart and figure out the
-         separator in the following way:
-         - when each line contains a TAB, assume TAB-separated material
-         - when each line contains a comma, assume CSV material
-         - else, assume one or more SPACE characters as separator."
+         separator using `org-table-guess-seperator'."
   (interactive "r\nP")
   (let* ((beg (min beg0 end0))
          (end (max beg0 end0))
@@ -882,13 +912,10 @@ org-table-convert-region
       (if (bolp) (backward-char 1) (end-of-line 1))
       (setq end (point-marker))
       ;; Get the right field separator
-      (unless separator
-        (goto-char beg)
-        (setq separator
-              (cond
-               ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
-               ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
-               (t 1))))
+      (when (and (not separator)
+                 (not (setq separator
+                            (org-table-guess-separator beg end))))
+        (user-error "Failed to guess separator"))
       (goto-char beg)
       (if (equal separator '(4))
           (while (< (point) end)
-- 
2.31.1

From fef97ffe27ff908647c45f1b066a845e71a0926f Mon Sep 17 00:00:00 2001
From: Utkarsh Singh <[email protected]>
Date: Wed, 28 Apr 2021 14:01:31 +0530
Subject: [PATCH 3/3] org-table.el (org-table-import): add file prompt

---
 lisp/org/org-table.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index 295f7a9b90..e904903576 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -963,7 +963,8 @@ org-table-import
 - (64)    Prompt for a regular expression as field separator.
 - integer When a number, use that many spaces, or a TAB, as field separator.
 - regexp  When a regular expression, use it to match the separator."
-  (interactive "f\nP")
+  (interactive (list (read-file-name "Import file: ")
+                     (prefix-numeric-value current-prefix-arg)))
   (when (and (called-interactively-p 'any)
              (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file))
              (not (yes-or-no-p "File does not have .txt, .tsv or .csv as extension. Do you still want to continue? ")))
-- 
2.31.1

example.csv
Description: csv file
-- 
Utkarsh Singh
http://utkarshsingh.xyz
