Re: official orgmode parser

2020-11-28 Thread Gerry Agbobada
Hello,

On Wed, Nov 11, 2020, at 10:15, Bastien wrote:
> 
> The example file would also be good for helping users track small
> syntactic changes when they happen.
> 
> 

When I mistakenly thought I could use an EBNF parser to parse Org-mode, I wrote 
a few small examples to get going (I never got past headings, as I'm not really 
good at parsing things): 
https://github.com/gagbo/LuaOrgParser/tree/master/tests/test-files/headings

Maybe it could be used as a base. I wasn't really sure how to handle test 
cases or how to create good ones.

Best regards,


Gerry Agbobada


Re: official orgmode parser

2020-11-12 Thread Tom Gillespie
Hi Bastien,
 I agree it would be great to ask them to contribute to whichever
ruby library they are using. I will see if I can get in touch, but I
have no idea of where to start if we really want to get to the folks
who could make a decision. It looks like gitlab uses the same org-ruby
library as well
https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/Gemfile#L156.
They may be easier to reach out to. I have also cced Wally to see if
he has any insights here. Best!
Tom



Re: official orgmode parser

2020-11-11 Thread Daniele Nicolodi
On 11/11/2020 10:15, Bastien wrote:
> Hi Daniele,
> 
> Daniele Nicolodi  writes:
> 
>> Would it make sense to have one "official" (or a set of) org-mode test
>> files and the corresponding syntax tree as parsed by org-element (maybe
>> in a format easier to read from other programming languages than
>> s-expressions, JSON maybe?) to make testing other parsers against the
>> reference implementation easier?
> 
> I think it is a very good idea.
> 
> The example file would also be good for helping users track small
> syntactic changes when they happen.
> 
> Would you like to work on such a file?

I don't have enough motivation to see this climb high enough in my TODO
list to see any meaningful progress in a reasonable time frame.  I am
more than happy to contribute to Org, but it is more effective to keep
these contributions related to my daily use of Org.

Cheers,
Dan



Re: official orgmode parser

2020-11-11 Thread Bastien
Hi Daniele,

Daniele Nicolodi  writes:

> Would it make sense to have one "official" (or a set of) org-mode test
> files and the corresponding syntax tree as parsed by org-element (maybe
> in a format easier to read from other programming languages than
> s-expressions, JSON maybe?) to make testing other parsers against the
> reference implementation easier?

I think it is a very good idea.

The example file would also be good for helping users track small
syntactic changes when they happen.

Would you like to work on such a file?

-- 
 Bastien



Re: official orgmode parser

2020-11-11 Thread Bastien
Hi Tom,

Tom Gillespie  writes:

>> which Ruby org-mode parser does Github use?
>
> I'm pretty sure that github uses https://github.com/wallyqs/org-ruby.
> It is ... not compliant, shall we say. I have making some fixes to the
> footnote parsing section on my todo list, but I don't expect to get to
> it any time in the near future.

Can you contact GitHub and see what they use?

Whatever they use, I suggest we ask them to support the org library
they use to let their users display Org files.

Maybe the same should be done with gitlab.com, since they also parse
Org files somehow.

-- 
 Bastien



Re: official orgmode parser

2020-11-11 Thread Bastien
Hi Ken,

Ken Mankoff  writes:

> Yes, I meant to write that I think Org syntax is maybe *not*
> context-free, and therefore EBNF can't capture all of it. But it could
> still be very helpful and capture most of it.

Perhaps.  Are you willing to give it a try and report here?

-- 
 Bastien



Re: official orgmode parser

2020-11-11 Thread Bastien
Hi Ken,

Ken Mankoff  writes:

> On 2020-10-26 at 09:24 -07, Nicolas Goaziou  wrote...
>> # This is a comment (1)
>>
>> #+begin_example
>> # This is not a comment (2)
>> #+end_example
>>
>> AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.
>
> I agree. I think this is a better (correct?) example than the
> footnotes on Org Syntax page.

Can you suggest a patch?

-- 
 Bastien



Re: official orgmode parser

2020-11-11 Thread Bastien
Hi Sébastien,

rey-coyrehourcq  writes:

> Some partial Org parsers (AST or regex...) I found on the web, as a
> recent state of the art: 

Thanks -- I've updated https://orgmode.org/worg/org-tools/ with this
information. 

Best,

-- 
 Bastien



Re: official orgmode parser

2020-10-26 Thread Przemysław Kamiński
I'm no expert in parsing but I would expect org's parser to be quite 
similar to the multitude of markdown or CommonMark [1] parsers. There 
isn't that much difference in syntax, except maybe org is more versatile 
and has more syntax elements, like drawers.


Searching for "EBNF Markdown" I stumbled upon [2].

[1] https://commonmark.org/
[2] http://roopc.net/posts/2014/markdown-cfg/

On 10/26/20 10:00 PM, Tom Gillespie wrote:

Here is an attempt to clarify my own confusion around the nested
structures in org. In short: each node in the headline tree and the
plain list tree can be parsed using the EBNF, but the nesting level
cannot, which means that certain useful operations, such as folding,
require additional rules beyond the grammar. More inline. Best!
Tom


Do you need to? This is valid as an entire Org file, I think:

*** foo
* bar
* baz

And that can be represented in EBNF. I'm not aware of places where behavior is 
indent-level specific, except inline tasks, and that edge case can be 
represented.


You are correct, and as long as the heading depth doesn't change some
interpretation then this is a non-issue. The reason I mentioned this
though is
because it means that you cannot determine how to correctly fold an
org file from the grammar alone.

To make sure I understand. It is possible to determine the number of
leading stars (and thus the level), but I think that it is not
possible to identify the end of a section.
For example

* a
*** b
** c
* d

You can parse out a 1, b 3, c 2, d 1, but if you want to be able to
nest b and c inside a but not nest d inside a, then you need a stack
in there somewhere. You
can't have a rule such as

section : headline content
content : text | section

because the parse would incorrectly nest sections at the same level,
you would have to write

section-level-1 : headline-1 content-1
content-1 : text | section-level-2-n

but since we have an arbitrary number of levels the grammar would have
to be infinite.
This is only if you want your grammar to be able to encode that the
content of sections
can include other more deeply nested sections, which in this context
we almost certainly
do not (as you point out).


There is a similar issue with the indentation level in
order to correctly interpret plain lists.


list ::= ('+' string newline)+ sublist?
sublist ::= (indent list)+

I think this captures lists?


Ah yes, I see my mistake here. In order for this to work the parser
has to implement significant whitespace,
so whitespace cannot be parsed into a single token. I think everything
works out after that.


Definitely not able to be represented in EBNF, unless as you say {name} is a 
limited vocabulary.


Darn those pesky open sets!






Re: official orgmode parser

2020-10-26 Thread Tom Gillespie
Even if this did work for plain lists it won't work for headlines
because headlines have an arbitrary number of stars and thus it is not
possible for the grammar to know what is a sub-headline vs "the next
headline". For a similar reason I'm fairly sure that the sublist
approach will not work due to issues with relative indent. Here is the
quote from the current draft syntax.

> An item ends before the next item, the first line less or equally indented
> than its starting line, or two consecutive empty lines. Indentation of lines
> within other greater elements do not count, neither do inlinetasks boundaries.

The "the first line less or equally indented than its starting line"
section is what prevents your approach from working because you have
to know the relative indentation in order to figure out which list
contains a nested list. As written, your grammar will parse a nested
list into a flat list. This is because there are an arbitrary number
of distinct tokens that could be =indent= in your grammar, and the EBNF
can't specify an ordering for them, so you can't say that one
indent is greater than another.

For list termination, the rule seems to be two newlines followed by
something that is not a list element. As a result, my inclination is to
parse only plain list elements and reconstruct the whole "list" only as
an internal semantic.
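
The "reconstruct the list afterwards" idea can be sketched in Python. This
is only an illustration: it assumes items use the "1." bullet, it ignores the
two-empty-lines termination rule, and the names parse_items and nest_items
are made up for this example rather than taken from Org. The grammar only
recognizes items; a stack then applies the "less or equally indented" rule.

```python
def parse_items(lines):
    """Recognize '1.' items and record their leading-space count."""
    items = []
    for line in lines:
        stripped = line.lstrip(" ")
        if stripped.startswith("1."):
            indent = len(line) - len(stripped)
            items.append((indent, stripped[2:].strip()))
    return items


def nest_items(items):
    """Nest (indent, text) pairs by relative indentation using a stack."""
    root = {"indent": -1, "text": None, "children": []}
    stack = [root]
    for indent, text in items:
        # An open item ends at the first item less or equally indented.
        while stack[-1]["indent"] >= indent:
            stack.pop()
        node = {"indent": indent, "text": text, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]
```

Feeding it the eight lines of the example below yields three top-level
items in this sketch; whether that matches what Org itself does with the
first, oddly indented line is exactly the thing to check.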

Check the behavior of
 1. to
1. see
  1. what
  1. I
   1. mean
   1.
1.
  1.



Re: official orgmode parser

2020-10-26 Thread Ken Mankoff


On 2020-10-26 at 14:00 -07, Tom Gillespie  wrote...
>> list ::= ('+' string newline)+ sublist?
>> sublist ::= (indent list)+
>>
>> I think this captures lists?
>
> Ah yes, I see my mistake here. In order for this to work the parser
> has to implement significant whitespace, so whitespace cannot be
> parsed into a single token. I think everything works out after that.

If we agree that the syntax above captures lists and sublists, then I think we 
could apply the same methods to the issue of headlines and sub-headlines?

  -k.



Re: official orgmode parser

2020-10-26 Thread Tom Gillespie
Here is an attempt to clarify my own confusion around the nested
structures in org. In short: each node in the headline tree and the
plain list tree can be parsed using the EBNF, but the nesting level
cannot, which means that certain useful operations, such as folding,
require additional rules beyond the grammar. More inline. Best!
Tom

> Do you need to? This is valid as an entire Org file, I think:
>
> *** foo
> * bar
> * baz
>
> And that can be represented in EBNF. I'm not aware of places where behavior 
> is indent-level specific, except inline tasks, and that edge case can be 
> represented.

You are correct, and as long as the heading depth doesn't change some
interpretation then this is a non-issue. The reason I mentioned this,
though, is because it means that you cannot determine how to correctly
fold an org file from the grammar alone.

To make sure I understand. It is possible to determine the number of
leading stars (and thus the level), but I think that it is not
possible to identify the end of a section.
For example

* a
*** b
** c
* d

You can parse out a 1, b 3, c 2, d 1, but if you want to be able to
nest b and c inside a but not nest d inside a, then you need a stack
in there somewhere. You can't have a rule such as

section : headline content
content : text | section

because the parse would incorrectly nest sections at the same level,
you would have to write

section-level-1 : headline-1 content-1
content-1 : text | section-level-2-n

but since we have an arbitrary number of levels the grammar would have
to be infinite. This is only if you want your grammar to be able to
encode that the content of sections can include other more deeply
nested sections, which in this context we almost certainly do not (as
you point out).
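
The "stack in there somewhere" can be sketched in a few lines of Python. It
is a minimal illustration, not how org-element does it: a regex recognizes
headlines and their star count (the part the grammar can do), and a stack
decides nesting. The name build_outline and the dict layout are assumptions
made for this example.

```python
import re

# One or more stars at column 0, a space, then the title.
HEADLINE = re.compile(r"^(\*+) (.*)$")


def build_outline(text):
    """Turn flat headlines into a tree, using a stack of open sections."""
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if not match:
            continue  # body text is ignored in this sketch
        level = len(match.group(1))
        # Close every open section at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "title": match.group(2), "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]
```

On the four-headline example above, this nests b and c inside a and keeps d
as a sibling of a, which is the folding structure the grammar alone cannot
express.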

> > There is a similar issue with the indentation level in
> > order to correctly interpret plain lists.
>
> list ::= ('+' string newline)+ sublist?
> sublist ::= (indent list)+
>
> I think this captures lists?

Ah yes, I see my mistake here. In order for this to work the parser
has to implement significant whitespace, so whitespace cannot be
parsed into a single token. I think everything works out after that.

> Definitely not able to be represented in EBNF, unless as you say {name} is a 
> limited vocabulary.

Darn those pesky open sets!



Re: official orgmode parser

2020-10-26 Thread Ken Mankoff


On 2020-10-26 at 10:59 -07, Tom Gillespie  wrote...
> You can identify headlines, but you can't identify nesting level;

Do you need to? This is valid as an entire Org file, I think:

*** foo
* bar
* baz

And that can be represented in EBNF. I'm not aware of places where behavior is 
indent-level specific, except inline tasks, and that edge case can be 
represented.

> There is a similar issue with the indentation level in
> order to correctly interpret plain lists.

list ::= ('+' string newline)+ sublist?
sublist ::= (indent list)+

I think this captures lists?

> Another example of something that requires a stack is the greater
> blocks, where you have #+begin_{name} and #+end_{name}, and the names
> must match.

Definitely not able to be represented in EBNF, unless as you say {name} is a 
limited vocabulary.

  -k.



Re: official orgmode parser

2020-10-26 Thread Tom Gillespie
I started writing down Org's grammar as an EBNF (with Racket's #lang
brag) on Saturday. There is indeed a layer of Org grammar that can be
implemented via EBNF, but it is fairly minimal. You can identify
headlines, but you can't identify nesting level; the arbitrary nesting
depth means that you have to have a stack to keep track. There is a
similar issue with the indentation level in order to correctly
interpret plain lists. If the canonical representation of an org
document were required to use org-adapt-indentation: nil;
org-edit-src-content-indentation: 0 and there were a canonical
normalization function, some of these issues would go away, but not all
of them, and I'm fairly certain that it is not possible to implement a
safe normalization function that won't mangle someone's formatting.
Another example of something that requires a stack is the greater
blocks, where you have #+begin_{name} and #+end_{name}, and the names
must match. If there was a closed set of names you could sort of do it
by hand, but since name can be any string that does not contain
whitespace, you have to have a stack to track which block you are in.
So, you can identify things that are heads, you can identify things
that are block start lines and block end lines, but you need stacks to
keep track of heading level, indentation, plain list level, and block
name. I might be missing a few other places where stacks are required,
but those are the big ones. Best,
Tom
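
To make the block-name point concrete, here is a small Python sketch. It
assumes block delimiters are case-insensitive and that blocks may nest,
and it ignores the fact that the contents of some blocks (example, src) are
not parsed at all. The function name find_blocks is hypothetical. The
regexes only recognize begin and end lines; the stack is what pairs them.

```python
import re

BEGIN = re.compile(r"^\s*#\+begin_(\S+)", re.IGNORECASE)
END = re.compile(r"^\s*#\+end_(\S+)", re.IGNORECASE)


def find_blocks(lines):
    """Return (name, start_index, end_index) for each matched block."""
    stack = []   # open blocks as (name, start_index)
    blocks = []
    for index, line in enumerate(lines):
        begin = BEGIN.match(line)
        end = END.match(line)
        if begin:
            stack.append((begin.group(1).lower(), index))
        elif end and stack and stack[-1][0] == end.group(1).lower():
            name, start = stack.pop()
            blocks.append((name, start, index))
        # An end line whose name does not match the innermost open block
        # is treated as ordinary text in this sketch.
    return blocks
```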

On Mon, Oct 26, 2020 at 12:48 PM Ken Mankoff  wrote:
>
>
> On 2020-10-26 at 09:24 -07, Nicolas Goaziou  wrote...
> > # This is a comment (1)
> >
> > #+begin_example
> > # This is not a comment (2)
> > #+end_example
> >
> > AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.
>
> I agree. I think this is a better (correct?) example than the footnotes on 
> Org Syntax page.
>
>   -k.
>
>



Re: official orgmode parser

2020-10-26 Thread Ken Mankoff


On 2020-10-26 at 09:24 -07, Nicolas Goaziou  wrote...
> # This is a comment (1)
>
> #+begin_example
> # This is not a comment (2)
> #+end_example
>
> AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.

I agree. I think this is a better (correct?) example than the footnotes on Org 
Syntax page.

  -k.




Re: official orgmode parser

2020-10-26 Thread Nicolas Goaziou
Ken Mankoff  writes:

> Yes, I meant to write that I think Org syntax is maybe *not*
> context-free, and therefore EBNF can't capture all of it. But it could
> still be very helpful and capture most of it.

I'm not arguing about the usefulness of a partial EBNF description. I'm
merely pointing out that the syntax is not context-free. Here is an
example:

# This is a comment (1)

#+begin_example
# This is not a comment (2)
#+end_example

AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.
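
For comparison, a hand-written line scanner tells the two lines apart
easily, because it can remember whether it is inside a block. The Python
below is a minimal sketch under simplifying assumptions (only example
blocks, only these five line classes); classify_lines and the label strings
are invented for the illustration.

```python
def classify_lines(lines):
    """Label each line, tracking whether we are inside an example block."""
    labels = []
    in_example = False
    for line in lines:
        stripped = line.strip()
        lowered = stripped.lower()
        if in_example:
            if lowered == "#+end_example":
                in_example = False
                labels.append("block-end")
            else:
                # Everything inside the block is literal text.
                labels.append("example-text")
        elif lowered.startswith("#+begin_example"):
            in_example = True
            labels.append("block-begin")
        elif stripped == "#" or stripped.startswith("# "):
            labels.append("comment")
        else:
            labels.append("other")
    return labels
```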

Regards,



Re: official orgmode parser

2020-10-26 Thread Ken Mankoff

On 2020-10-26 at 07:21 -07, Nicolas Goaziou  wrote...
> Ken Mankoff  writes:
>
>> I question if this is possible because EBNF is for context-free
>> grammars, but I *think* Org syntax is context-free.
>
> It's not, as explained in a footnote in the Org syntax document.

Yes, I meant to write that I think Org syntax is maybe *not* context-free, and 
therefore EBNF can't capture all of it. But it could still be very helpful and 
capture most of it.

But the more I think about it, the more I think Org may be context-free.

For the footnotes, I'm not sure that "(1) In particular, the parser requires 
stars at column 0 to be quoted by a comma when they do not define a headline" 
violates context. An "*" in the first column defines a header. It can be 
escaped by anything else too (" *" works too). If ",*" has a special meaning, 
that can be captured elsewhere in the syntax.

I'm also not sure (2) violates context-freeness, at least in the EBNF sense 
where a context can include a newline. See for example:

section ::= "*"+ string (tag+) newline (planning newline)? (property_drawer newline)?

planning ::= ("SCHEDULED:" "<" date_or_time ">")? ("DEADLINE:" "<" date_or_time ">")?

property_drawer ::= ":PROPERTIES:" newline drawer_contents newline ":END:"

drawer_contents ::= ":" property ":" whitespace string

Where the first line, "section" is represented graphically as the attached 
image.

I guess I'm not 100% clear what "context-free" means. EBNF can represent a 
language where a for loop has an opening and closing brace. The closing brace 
is context-dependent, just as the planning or property drawers are.

I recently used EBNF to represent a CSV file with header, and I was unable to 
capture the requirement that the header column must have the same number of 
fields or commas as the data section. I think that is context-free. 
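
Whatever the right label for that CSV constraint is, it is easy to check
after parsing rather than in the grammar. A minimal Python sketch, using the
standard csv module; the function name rows_with_wrong_width is made up for
this example.

```python
import csv
import io


def rows_with_wrong_width(text):
    """Return 1-based line numbers of rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to compare
    # Data rows start on line 2, right after the header.
    return [
        number
        for number, row in enumerate(reader, start=2)
        if len(row) != len(header)
    ]
```

The same split (a grammar for the shape of each line, a separate pass for
cross-line constraints) is one way to handle the Org cases discussed here.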



  -k.


Re: official orgmode parser

2020-10-26 Thread Nicolas Goaziou
Hello,

Ken Mankoff  writes:

> I question if this is possible because EBNF is for context-free
> grammars, but I *think* Org syntax is context-free.

It's not, as explained in a footnote in the Org syntax document.

Regards,
-- 
Nicolas Goaziou



Re: official orgmode parser

2020-10-26 Thread Ken Mankoff
Hello,

On 2020-09-23 at 01:09 -07, Bastien  wrote...
> I disagree that a parser is too difficult to maintain because Org is a
> moving target. Org core syntax is not moving anymore, a parser can
> reasonably target it. That's what is done with the Ruby parser, in use
> in this small project called github.com :)

Do you think it would be useful (or possible) to represent the current Org 
syntax in EBNF form so that people can use the EBNF to build parsers or 
graphically understand the form? I'm thinking of a nice page of railroad 
diagrams from this tool: https://github.com/GuntherRademacher/rr

I question if this is possible because EBNF is for context-free grammars, but I 
*think* Org syntax is context-free. Even if not, I think those railroad 
diagrams might be useful for parser-writers and can still describe 99 % of the 
syntax, even if a few extra sentences are needed to clarify some edge case.

  -k.



Re: official orgmode parser

2020-10-24 Thread Tom Gillespie
> which Ruby org-mode parser does Github use?

I'm pretty sure that github uses https://github.com/wallyqs/org-ruby.
It is ... not compliant, shall we say. I have making some fixes to the
footnote parsing section on my todo list, but I don't expect to get to
it any time in the near future.

Tom



Re: official orgmode parser

2020-10-24 Thread Daniele Nicolodi
On 23/09/2020 10:09, Bastien wrote:
> I disagree that a parser is too difficult to maintain because Org is 
> a moving target.  Org core syntax is not moving anymore, a parser can
> reasonably target it.  That's what is done with the Ruby parser, in
> use in this small project called github.com :)

(Just an aside: which Ruby org-mode parser does GitHub use? I sometimes
find instances where GitHub does not render an org-mode file correctly
and I would be happy to file bugs to have them corrected).

> So I'd say:
> 
> - let's enhance Worg's documentation
> - yes, please go for enhancing parsing tools
> 
> I don't think we need official tools.  The official Org parser exists,
> it is Org itself.

Would it make sense to have one "official" (or a set of) org-mode test
files and the corresponding syntax tree as parsed by org-element (maybe
in a format easier to read from other programming languages than
s-expressions, JSON maybe?) to make testing other parsers against the
reference implementation easier?
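
As a rough idea of how such reference files could be used, here is a Python
sketch that compares a parser's output with a reference tree, both already
loaded from JSON. The tree layout (keys such as "type", "level",
"children") is purely an assumption for the example; no such official
format exists in this thread.

```python
def first_difference(expected, actual, path="$"):
    """Describe the first mismatch between two JSON-like trees, or None."""
    if type(expected) is not type(actual):
        return (f"{path}: expected {type(expected).__name__}, "
                f"got {type(actual).__name__}")
    if isinstance(expected, dict):
        for key in sorted(set(expected) | set(actual)):
            if key not in expected:
                return f"{path}.{key}: unexpected key"
            if key not in actual:
                return f"{path}.{key}: missing key"
            diff = first_difference(expected[key], actual[key], f"{path}.{key}")
            if diff:
                return diff
        return None
    if isinstance(expected, list):
        if len(expected) != len(actual):
            return f"{path}: expected {len(expected)} items, got {len(actual)}"
        for index, (exp, act) in enumerate(zip(expected, actual)):
            diff = first_difference(exp, act, f"{path}[{index}]")
            if diff:
                return diff
        return None
    if expected != actual:
        return f"{path}: expected {expected!r}, got {actual!r}"
    return None
```

A test runner for another parser would load the reference JSON, run the
parser on the matching .org file, and fail with the returned message.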

Maybe the org-mode test suite already has something like this. I haven't
looked for it yet.

Cheers,
Dan



Re: official orgmode parser

2020-09-23 Thread rey-coyrehourcq
Hi Przemysław,

Some partial Org parsers (AST or regex...) I found on the web, as a recent 
state of the art: 

* org-js
https://github.com/mooz/org-js

* orgajs
Orga is a flexible org-mode syntax parser. It parses org content into AST 
(Abstract Syntax Tree)
https://github.com/orgapp/orgajs
* orgparse
* org-mode-parser
https://github.com/daitangio/org-mode-parser
* org-rs
https://github.com/org-rs/org-rs
* org-ruby
https://github.com/wallyqs/org-ruby
* org-swift
https://github.com/orgapp/swift-org
* organice
https://github.com/200ok-ch/organice
* organum
https://github.com/seylerius/organum
* clj org
https://github.com/eigenhombre/clj-org
* orgmode-parse
https://github.com/ixmatus/orgmode-parse
* org-mode
https://www.fosskers.ca/
https://hackage.haskell.org/package/org-mode
* orgize
https://github.com/PoiScript/orgize
https://www.worthe-it.co.za/blog.html

Best regards,

Le mercredi 23 septembre 2020 à 19:46 +0200, Przemysław Kamiński a écrit :
> On 9/23/20 10:09 AM, Bastien wrote:
> > Hi Przemysław,
> > 
> > Przemysław Kamiński  writes:
> > 
> > > I oftentimes find myself needing to parse org files with some external
> > > tools (to generate reports for customers or sum up clock times for
> > > given month, etc). Looking through the list
> > > 
> > > https://orgmode.org/worg/org-tools/
> > 
> > Can you help make the above page more useful to everyone?
> > 
> > Perhaps we can have a separate worg page just for parsers, reporting
> > the ones that seem to fully work.
> > 
> > I disagree that a parser is too difficult to maintain because Org is
> > a moving target.  Org core syntax is not moving anymore, a parser can
> > reasonably target it.  That's what is done with the Ruby parser, in
> > use in this small project called github.com :)
> > 
> > So I'd say:
> > 
> > - let's enhance Worg's documentation
> > - yes, please go for enhancing parsing tools
> > 
> > I don't think we need official tools.  The official Org parser exists,
> > it is Org itself.
> > 
> > Thanks,
> > 
> 
> Hello Bastien,
> 
> Thank you for your remarks.
> 
> I updated the README, hopefully it's more usable now.
> 
> Przemek
> 
-- 


Sébastien Rey-Coyrehourcq
Research Engineer UMR IDEES
02.35.14.69.30

{Stronger security for your email, follow EFF tutorial : https://ssd.eff.org/}






Re: official orgmode parser

2020-09-23 Thread Przemysław Kamiński

On 9/23/20 10:09 AM, Bastien wrote:

Hi Przemysław,

Przemysław Kamiński  writes:


I oftentimes find myself needing to parse org files with some external
tools (to generate reports for customers or sum up clock times for
given month, etc). Looking through the list

https://orgmode.org/worg/org-tools/


Can you help make the above page more useful to everyone?

Perhaps we can have a separate worg page just for parsers, reporting
the ones that seem to fully work.

I disagree that a parser is too difficult to maintain because Org is
a moving target.  Org core syntax is not moving anymore, a parser can
reasonably target it.  That's what is done with the Ruby parser, in
use in this small project called github.com :)

So I'd say:

- let's enhance Worg's documentation
- yes, please go for enhancing parsing tools

I don't think we need official tools.  The official Org parser exists,
it is Org itself.

Thanks,



Hello Bastien,

Thank you for your remarks.

I updated the README, hopefully it's more usable now.

Przemek



Re: official orgmode parser

2020-09-23 Thread Bastien
Hi Gerry,

"Gerry Agbobada"  writes:

> Having a tree-sitter parser would be really great in my opinion

1+

Thanks for working on this, let us know how it goes!

-- 
 Bastien



Re: official orgmode parser

2020-09-23 Thread Bastien
Hi Przemysław,

Przemysław Kamiński  writes:

> I oftentimes find myself needing to parse org files with some external
> tools (to generate reports for customers or sum up clock times for
> given month, etc). Looking through the list
>
> https://orgmode.org/worg/org-tools/

Can you help make the above page more useful to everyone?

Perhaps we can have a separate worg page just for parsers, reporting
the ones that seem to fully work.

I disagree that a parser is too difficult to maintain because Org is 
a moving target.  Org core syntax is not moving anymore, a parser can
reasonably target it.  That's what is done with the Ruby parser, in
use in this small project called github.com :)

So I'd say:

- let's enhance Worg's documentation
- yes, please go for enhancing parsing tools

I don't think we need official tools.  The official Org parser exists,
it is Org itself.

Thanks,

-- 
 Bastien



Re: official orgmode parser

2020-09-17 Thread Przemysław Kamiński

On 9/17/20 3:18 AM, Ihor Radchenko wrote:

So basically this is what this thread is about. One needs a working
Emacs instance and work in "push" mode to export any Org data. This
requires dealing with temporary files, as described above, and some
ad-hoc formats to keep whatever data I need to pull from org.



"Pull" mode would be preferred. I could then, say, write a script in
Guile, execute 'emacs -batch' to export org data (I'm ok with that),
then parse the S-expressions to get what I need.


My choice to use "push" mode is just for performance reasons. Nothing
prevents you from writing a function called from emacs --batch that
converts parsed org data into whatever format your Guile script prefers.
That function may be either on Emacs side or on Guile side. Probably,
Emacs has more capabilities when dealing with s-expressions though.

You can even directly push the information from Emacs to an API server.
You may find https://github.com/tkf/emacs-request useful for this task.

Finally, you may also consider clock tables to create clock summaries
using existing org-mode functionality. The tables can be named and
accessed using any programming language via babel.

Best,
Ihor


Przemysław Kamiński  writes:


On 9/16/20 2:02 PM, Ihor Radchenko wrote:

However, what Ihor presented is interesting. Do you use a similar approach
with shellout and 'emacs -batch' to show the currently running task, or do
you 'push' data from emacs to show it in the taskbar?


I prefer to avoid querying emacs too often for performance reasons.
Instead, I only update the clocking info when I clock in/out in emacs.
Then, the clocked in time is dynamically updated by independent bash
script.

The scheme is the following:
1. org clock in/out in Emacs triggers writing clocking info into
 ~/.org-clock-in status file
2. bash script periodically monitors the file and calculates the clocked
 in time according to the contents and time from last modification
3. the script updates simple textbox widget using awesome-client
4. the script also warns me (notify-send) when the weighted clocked in
 time is negative (meaning that I should switch to some more
 productive activity)
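
Step 2 of the scheme above could look roughly like this in Python. The
status file format is not given in the thread, so the sketch assumes one
line of the form "minutes<TAB>task name", written by Emacs on clock in/out;
that format and the name current_clock are assumptions for illustration
only.

```python
import os
import time


def current_clock(status_path, now=None):
    """Return (task, minutes): stored minutes plus time since last write."""
    with open(status_path, encoding="utf-8") as handle:
        # Assumed format: "<minutes>\t<task name>"
        minutes_text, _, task = handle.read().strip().partition("\t")
    if now is None:
        now = time.time()
    # The file is only rewritten on clock in/out, so its modification time
    # tells us how long the current clock has been running since then.
    elapsed_minutes = (now - os.path.getmtime(status_path)) / 60
    return task, float(minutes_text) + elapsed_minutes
```

The point of the design is that Emacs is only involved when the clock
changes; the periodic script needs nothing but the file.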

Best,
Ihor

Przemysław Kamiński  writes:


On 9/16/20 9:56 AM, Ihor Radchenko wrote:

Wow, another awesomewm user here; could you share your code?


Are you interested in something particular about awesome WM integration?

I am using simple textbox widgets to show currently clocked in task and
weighted summary of clocked time. See the attachments.

Best,
Ihor




Marcin Borkowski  writes:


On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:


So, I keep clock times for work in org mode, this is very
handy. However, my customers require that I use their service to
provide the times. They do offer an API. So basically I'm using elisp to
parse org, make API calls, and at the same time generate CSV reports
with a Python interop with org babel (because my elisp is just too bad
to do that). If I had access to some org parser, I'd pick a language
that would be more comfortable for me to get the job done. I guess it
can all be done in elisp, however this is just a tool for me alone and
I have limited time resources on hacking things for myself :)
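
For the narrow job of summing clock times outside Emacs, a full parser is
not strictly needed. The Python sketch below sums closed CLOCK lines per
month with a regex. It assumes the usual
"CLOCK: [YYYY-MM-DD Day HH:MM]--[YYYY-MM-DD Day HH:MM]" shape, skips
running (unclosed) clocks, and does not attribute time to headings; the
name clocked_minutes_by_month is invented for the example.

```python
import re
from datetime import datetime

CLOCK = re.compile(
    r"CLOCK: \[(\d{4}-\d{2}-\d{2}) \S+ (\d{1,2}:\d{2})\]"
    r"--\[(\d{4}-\d{2}-\d{2}) \S+ (\d{1,2}:\d{2})\]"
)


def clocked_minutes_by_month(text):
    """Sum closed CLOCK entries, keyed by the 'YYYY-MM' of their start."""
    totals = {}
    for match in CLOCK.finditer(text):
        start = datetime.strptime(
            f"{match.group(1)} {match.group(2)}", "%Y-%m-%d %H:%M")
        end = datetime.strptime(
            f"{match.group(3)} {match.group(4)}", "%Y-%m-%d %H:%M")
        minutes = int((end - start).total_seconds() // 60)
        month = start.strftime("%Y-%m")
        totals[month] = totals.get(month, 0) + minutes
    return totals
```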


I was in the exact same situation - I use Org-mode clocking, and we use
Toggl at our company, so I wrote a simple tool to fire API requests to
Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
It's a bit more than 200 lines of Elisp, so you might try to look into
it and adapt it to whatever tool your employer is using.


Another one is generating a total hours report for day/week/month to put
into my awesomewm toolbar. I ended up using orgstat
https://github.com/volhovM/orgstat
however the author is creating his own DSL in YAML and I guess things
would be much better off if it all stayed in some Scheme :)


Wow, another awesomewm user here; could you share your code?

Best,

--
Marcin Borkowski
http://mbork.pl



I don't have interesting code, just a standard awesomewm setup. I run a
periodic script to output data computed by orgstat and show it in the
taskbar (uses the shellout_widget).

However, what Ihor presented is interesting. Do you use a similar approach
with shellout and 'emacs -batch' to show the currently running task, or do
you 'push' data from emacs to show it in the taskbar?

P.



So basically this is what this thread is about. One needs a working
Emacs instance and work in "push" mode to export any Org data. This
requires dealing with temporary files, as described above, and some
ad-hoc formats to keep whatever data I need to pull from org.

"Pull" mode would be preferred. I could then, say, write a script in
Guile, execute 'emacs -batch' to export org data (I'm ok with that),
then parse the S-expressions to get what I need.

P.




OK so this is what I got so far
https://gitlab.com/cgenie/org-parse
I stole the simple test.org file from ox-json test suite.
Guile seems to correctly parse that output. At least something 

Re: official orgmode parser

2020-09-16 Thread Ihor Radchenko
> So basically this is what this thread is about. One needs a working 
> Emacs instance and work in "push" mode to export any Org data. This 
> requires dealing with temporary files, as described above, and some 
> ad-hoc formats to keep whatever data I need to pull from org.

> "Pull" mode would be preferred. I could then, say, write a script in 
> Guile, execute 'emacs -batch' to export org data (I'm ok with that), 
> then parse the S-expressions to get what I need.

My choice to use "push" mode is just for performance reasons. Nothing
prevents you from writing a function called from emacs --batch that
converts parsed org data into whatever format your Guile script prefers.
That function may be either on Emacs side or on Guile side. Probably,
Emacs has more capabilities when dealing with s-expressions though.
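
As an illustration of that "pull" setup, here is a Python sketch that
shells out to emacs --batch and reads headlines back as JSON. The Elisp
uses org-element-parse-buffer, org-element-map and json-encode; the choice
of fields (level, raw title), the helper names, and quoting the file path
with json.dumps are assumptions of this sketch, which has not been tried
against every Org version.

```python
import json
import subprocess

# Elisp run inside Emacs: parse the file and print headlines as JSON.
ELISP_TEMPLATE = """
(progn
  (require 'org)
  (require 'org-element)
  (require 'json)
  (with-temp-buffer
    (insert-file-contents %s)
    (org-mode)
    (princ
     (json-encode
      (vconcat
       (org-element-map (org-element-parse-buffer 'headline) 'headline
         (lambda (h)
           (list (cons 'level (org-element-property :level h))
                 (cons 'title (org-element-property :raw-value h))))))))))
"""


def build_command(org_path, emacs="emacs"):
    """Build the argv list; the path is embedded as a quoted string literal."""
    return [emacs, "--batch", "--eval", ELISP_TEMPLATE % json.dumps(org_path)]


def read_headlines(org_path):
    """Run Emacs in batch mode and decode the JSON it prints to stdout."""
    result = subprocess.run(
        build_command(org_path), check=True, capture_output=True, text=True)
    return json.loads(result.stdout)
```

The conversion lives on the Emacs side here, so the calling script only has
to understand JSON; a Guile caller could keep s-expressions instead.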

You can even directly push the information from Emacs to an API server.
You may find https://github.com/tkf/emacs-request useful for this task.

Finally, you may also consider clock tables to create clock summaries
using existing org-mode functionality. The tables can be named and
accessed using any programming language via babel.

Best,
Ihor


Przemysław Kamiński  writes:

> On 9/16/20 2:02 PM, Ihor Radchenko wrote:
>>> However, what Ihor presented is interesting. Do you use a similar approach
>>> with shellout and 'emacs -batch' to show the currently running task, or do
>>> you 'push' data from emacs to show it in the taskbar?
>> 
>> I prefer to avoid querying emacs too often for performance reasons.
>> Instead, I only update the clocking info when I clock in/out in emacs.
>> Then, the clocked in time is dynamically updated by independent bash
>> script.
>> 
>> The scheme is the following:
>> 1. org clock in/out in Emacs triggers writing clocking info into
>> ~/.org-clock-in status file
>> 2. bash script periodically monitors the file and calculates the clocked
>> in time according to the contents and time from last modification
>> 3. the script updates simple textbox widget using awesome-client
>> 4. the script also warns me (notify-send) when the weighted clocked in
>> time is negative (meaning that I should switch to some more
>> productive activity)
>> 
>> Best,
>> Ihor
>> 
>> Przemysław Kamiński  writes:
>> 
>>> On 9/16/20 9:56 AM, Ihor Radchenko wrote:
>>>>> Wow, another awesomewm user here; could you share your code?
>>>>
>>>> Are you interested in something particular about awesome WM integration?
>>>>
>>>> I am using simple textbox widgets to show currently clocked in task and
>>>> weighted summary of clocked time. See the attachments.
>>>>
>>>> Best,
>>>> Ihor
>>>>
>>>>
>>>>
>>>>
>>>> Marcin Borkowski  writes:
>>>>
>>>>> On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:
>>>>>
>>>>>> So, I keep clock times for work in org mode, this is very
>>>>>> handy. However, my customers require that I use their service to
>>>>>> provide the times. They do offer API. So basically I'm using elisp to
>>>>>> parse org, make API calls, and at the same time generate CSV reports
>>>>>> with a Python interop with org babel (because my elisp is just too bad
>>>>>> to do that). If I had access to some org parser, I'd pick a language
>>>>>> that would be more comfortable for me to get the job done. I guess it
>>>>>> can all be done in elisp, however this is just a tool for me alone and
>>>>>> I have limited time resources on hacking things for myself :)
>>>>>
>>>>> I was in the exact same situation - I use Org-mode clocking, and we use
>>>>> Toggl at our company, so I wrote a simple tool to fire API requests to
>>>>> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
>>>>> It's a bit more than 200 lines of Elisp, so you might try to look into
>>>>> it and adapt it to whatever tool your employer is using.
>>>>>
>>>>>> Another one is generating total hours report for day/week/month to put
>>>>>> into my awesomewm toolbar. I ended up using orgstat
>>>>>> https://github.com/volhovM/orgstat
>>>>>> however the author is creating his own DSL in YAML and I guess things
>>>>>> were much better off if it all stayed in some Scheme :)
>>>>>
>>>>> Wow, another awesomewm user here; could you share your code?
>>>>>
>>>>> Best,
>>>>>
>>>>> -- 
>>>>> Marcin Borkowski
>>>>> http://mbork.pl
>>>
>>>
>>> I don't have interesting code, just standard awesomewm setup. I run
>>> periodic script to output data computed by orgstat and show it in the
>>> taskbar (uses the shellout_widget).
>>>
>>> However what Ihor presented is interesting. Do you use similar approach
>>> with shellout and 'emacs -batch' to show currently running task or you
>>> 'push' data from emacs to show it in the taskbar?
>>>
>>> P.
>
>
> So basically this is what this thread is about. One needs a working 
> Emacs instance and work in "push" mode to export any Org data. This 
> requires dealing with temporary files, as described above, and some 
> ad-hoc formats to keep whatever data I need to pull from org.
>
> "Pull" mode would be 

Re: official orgmode parser

2020-09-16 Thread Matt Huszagh
"Gerry Agbobada"  writes:

> I'm currently toying with the idea of trying a tree-sitter parser for Org. 
> The very static nature of a shared object parser (knowing TODO keywords are 
> pretty dynamic for example) is a challenge I'm not sure to overcome ; to be 
> honest even without that I can't say I'll manage to do it.

A tree-sitter parser for org would be great! Please keep this list
posted on any developments you make on this front. I made some minimal
attempts at this a while back, but didn't get very far.

Matt



Re: official orgmode parser

2020-09-16 Thread Ihor Radchenko
FYI: You may find https://github.com/ndwarshuis/org-ml helpful.


Przemysław Kamiński  writes:

> On 9/15/20 2:37 PM, to...@tuxteam.de wrote:
>> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
>> 
>> [...]
>> 
>>> There's the org-json (or ox-json) package but for some reason I
>>> wasn't able to run it successfully. I guess export to S-exps would
>>> be best here. But yes I'll check that out.
>> 
>> If that's your route, perhaps the "Org element API" [1] might be
>> helpful. Especially `org-element-parse-buffer' gives you a Lisp
>> data structure which is supposed to be a parse of your Org buffer.
>> 
>>  From there to S-expression can be trivial (e.g. `print' or `pp'),
>> depending on what you want to do.
>> 
>> Walking the structure should be nice in Lisp, too.
>> 
>> The topic of (non-Emacs) parsing of Org comes up regularly, and
>> there is a good (but AFAIK not-quite-complete) Org syntax spec
>> in Worg [2], but there are a couple of difficulties to be mastered
>> before such a thing can become really enjoyable and useful.
>> 
>> The loose specification of Org's format (arguably its second
>> or third strongest asset, the first two being its incredible
>> community and Emacs itself) is something which makes this
>> problem "interesting". People have invented lots of usages
>> which might be broken should Org change to a strict formal
>> spec. You don't want to break those people.
>> 
>> But yes, perhaps some day someone nails it. Perhaps it's you :)
>> 
>> Cheers
>> 
>> [1] https://orgmode.org/worg/dev/org-element-api.html
>> [2] https://orgmode.org/worg/dev/org-syntax.html
>> 
>>   - t
>> 
>
> So I looked at (pp (org-element-parse-buffer)) however it does print out 
> recursive stuff which other schemes have trouble parsing.
>
> My code looks more or less like this:
>
> (defun org-parse (f)
>   (with-temp-buffer
>     (find-file f)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))
>
>
> strip-parent is basically (plist-put props :parent nil) for elements 
> properties. However it turns out there are more recursive objects, like
>
> :title
>#("Headline 1" 0 10
>  (:parent
>   (headline #2
> (section
>
> So I'm wondering do I have to do it by hand for all cases or is there 
> some way to output only a simple AST without those nested objects?
>
> Best,
> Przemek



Re: official orgmode parser

2020-09-16 Thread Przemysław Kamiński

On 9/16/20 2:02 PM, Ihor Radchenko wrote:

However what Ihor presented is interesting. Do you use similar approach
with shellout and 'emacs -batch' to show currently running task or you
'push' data from emacs to show it in the taskbar?


I prefer to avoid querying emacs too often for performance reasons.
Instead, I only update the clocking info when I clock in/out in emacs.
Then, the clocked in time is dynamically updated by independent bash
script.

The scheme is the following:
1. org clock in/out in Emacs trigger writing clocking info into
~/.org-clock-in status file
2. bash script periodically monitors the file and calculates the clocked
in time according to the contents and time from last modification
3. the script updates simple textbox widget using awesome-client
4. the script also warns me (notify-send) when the weighted clocked in
time is negative (meaning that I should switch to some more
productive activity)

Best,
Ihor

Przemysław Kamiński  writes:


On 9/16/20 9:56 AM, Ihor Radchenko wrote:

Wow, another awesomewm user here; could you share your code?


Are you interested in something particular about awesome WM integration?

I am using simple textbox widgets to show currently clocked in task and
weighted summary of clocked time. See the attachments.

Best,
Ihor




Marcin Borkowski  writes:


On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:


So, I keep clock times for work in org mode, this is very
handy. However, my customers require that I use their service to
provide the times. They do offer API. So basically I'm using elisp to
parse org, make API calls, and at the same time generate CSV reports
with a Python interop with org babel (because my elisp is just too bad
to do that). If I had access to some org parser, I'd pick a language
that would be more comfortable for me to get the job done. I guess it
can all be done in elisp, however this is just a tool for me alone and
I have limited time resources on hacking things for myself :)


I was in the exact same situation - I use Org-mode clocking, and we use
Toggl at our company, so I wrote a simple tool to fire API requests to
Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
It's a bit more than 200 lines of Elisp, so you might try to look into
it and adapt it to whatever tool your employer is using.


Another one is generating total hours report for day/week/month to put
into my awesomewm toolbar. I ended up using orgstat
https://github.com/volhovM/orgstat
however the author is creating his own DSL in YAML and I guess things
were much better off if it all stayed in some Scheme :)


Wow, another awesomewm user here; could you share your code?

Best,

--
Marcin Borkowski
http://mbork.pl



I don't have interesting code, just standard awesomewm setup. I run
periodic script to output data computed by orgstat and show it in the
taskbar (uses the shellout_widget).

However what Ihor presented is interesting. Do you use similar approach
with shellout and 'emacs -batch' to show currently running task or you
'push' data from emacs to show it in the taskbar?

P.



So basically this is what this thread is about. One needs a working 
Emacs instance and work in "push" mode to export any Org data. This 
requires dealing with temporary files, as described above, and some 
ad-hoc formats to keep whatever data I need to pull from org.


"Pull" mode would be preferred. I could then, say, write a script in 
Guile, execute 'emacs -batch' to export org data (I'm ok with that), 
then parse the S-expressions to get what I need.


P.



Re: official orgmode parser

2020-09-16 Thread tomas
On Wed, Sep 16, 2020 at 02:09:42PM +0200, Przemysław Kamiński wrote:

[...]

> So I looked at (pp (org-element-parse-buffer)) however it does print
> out recursive stuff which other schemes have trouble parsing.
> 
> My code looks more or less like this:
> 
> (defun org-parse (f)
>   (with-temp-buffer
>     (find-file f)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))

Actually I'd tend to not modify the result, but to walk
it.

See `pcase' for a powerful pattern matcher which might
help you there.
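
To illustrate the "walk it, don't modify it" idea outside of Elisp, here is a minimal Python sketch. It does not use the real org-element data: the dictionary layout (`type`, `props`, `children`) and the `parent` key are assumptions chosen only to mimic a tree whose nodes point back at their parents, including nodes nested inside a property such as the title. The walker builds a fresh, parent-free copy and leaves the input untouched.

```python
import json


def to_plain(node):
    """Return a copy of `node` that contains no parent back-references.

    Assumed (hypothetical) node layout: a dict with "type", "props" and
    "children"; anything that is not a dict (strings, numbers) is a leaf.
    """
    if not isinstance(node, dict):
        return node
    props = {}
    for key, value in node["props"].items():
        if key == "parent":
            continue  # the back-reference is what makes the tree cyclic
        if isinstance(value, list):
            # Properties such as a title may themselves hold nested nodes.
            value = [to_plain(item) for item in value]
        props[key] = value
    return {
        "type": node["type"],
        "props": props,
        "children": [to_plain(child) for child in node["children"]],
    }


def to_json(node):
    """Serialise the parent-free copy; the original tree is not changed."""
    return json.dumps(to_plain(node))
```

Because the copy is built while walking, nothing has to be stripped "by hand for all cases": any place where a node can occur goes through the same function.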

Cheers
 - t




Re: official orgmode parser

2020-09-16 Thread Przemysław Kamiński

On 9/15/20 2:37 PM, to...@tuxteam.de wrote:

On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:

[...]


There's the org-json (or ox-json) package but for some reason I
wasn't able to run it successfully. I guess export to S-exps would
be best here. But yes I'll check that out.


If that's your route, perhaps the "Org element API" [1] might be
helpful. Especially `org-element-parse-buffer' gives you a Lisp
data structure which is supposed to be a parse of your Org buffer.

 From there to S-expression can be trivial (e.g. `print' or `pp'),
depending on what you want to do.

Walking the structure should be nice in Lisp, too.

The topic of (non-Emacs) parsing of Org comes up regularly, and
there is a good (but AFAIK not-quite-complete) Org syntax spec
in Worg [2], but there are a couple of difficulties to be mastered
before such a thing can become really enjoyable and useful.

The loose specification of Org's format (arguably its second
or third strongest asset, the first two being its incredible
community and Emacs itself) is something which makes this
problem "interesting". People have invented lots of usages
which might be broken should Org change to a strict formal
spec. You don't want to break those people.

But yes, perhaps some day someone nails it. Perhaps it's you :)

Cheers

[1] https://orgmode.org/worg/dev/org-element-api.html
[2] https://orgmode.org/worg/dev/org-syntax.html

  - t



So I looked at (pp (org-element-parse-buffer)) however it does print out 
recursive stuff which other schemes have trouble parsing.


My code looks more or less like this:

(defun org-parse (f)
  (with-temp-buffer
    (find-file f)
    (let* ((parsed (org-element-parse-buffer))
           (all (append org-element-all-elements org-element-all-objects))
           (mapped (org-element-map parsed all
                     (lambda (item)
                       (strip-parent item)))))
      (pp mapped))))


strip-parent is basically (plist-put props :parent nil) for elements 
properties. However it turns out there are more recursive objects, like


:title
  #("Headline 1" 0 10
(:parent
 (headline #2
   (section

So I'm wondering do I have to do it by hand for all cases or is there 
some way to output only a simple AST without those nested objects?


Best,
Przemek



Re: official orgmode parser

2020-09-16 Thread Ihor Radchenko
> However what Ihor presented is interesting. Do you use similar approach 
> with shellout and 'emacs -batch' to show currently running task or you 
> 'push' data from emacs to show it in the taskbar?

I prefer to avoid querying emacs too often for performance reasons.
Instead, I only update the clocking info when I clock in/out in emacs.
Then, the clocked in time is dynamically updated by independent bash
script.

The scheme is the following:
1. org clock in/out in Emacs trigger writing clocking info into
   ~/.org-clock-in status file
2. bash script periodically monitors the file and calculates the clocked
   in time according to the contents and time from last modification
3. the script updates simple textbox widget using awesome-client
4. the script also warns me (notify-send) when the weighted clocked in
   time is negative (meaning that I should switch to some more
   productive activity)
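
Step 2 of the scheme above could look roughly like the following Python sketch (the original uses a bash script, which is not shown in this thread). The status file format is an assumption made up for the example: a single line holding the minutes already clocked, a tab, and the task title, with an empty file meaning "clocked out". The function name `current_clock` is likewise hypothetical.

```python
import os
import time


def current_clock(status_path, now=None):
    """Return (task, minutes) for the running clock, or None if clocked out.

    Assumed file content: "<minutes already clocked>\t<task title>".
    The time since the file was last modified is added to that base value,
    so Emacs only has to write the file on clock in/out.
    """
    try:
        with open(status_path, encoding="utf-8") as handle:
            content = handle.read().strip()
        modified = os.path.getmtime(status_path)
    except FileNotFoundError:
        return None
    if not content:
        return None  # empty file: nothing is clocked in
    minutes_text, _, task = content.partition("\t")
    now = time.time() if now is None else now
    elapsed = int(minutes_text) + int((now - modified) // 60)
    return task, elapsed
```

A taskbar script would call this in a loop and hand the result to the widget; Emacs itself is never queried.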

Best,
Ihor

Przemysław Kamiński  writes:

> On 9/16/20 9:56 AM, Ihor Radchenko wrote:
>>> Wow, another awesomewm user here; could you share your code?
>> 
>> Are you interested in something particular about awesome WM integration?
>> 
>> I am using simple textbox widgets to show currently clocked in task and
>> weighted summary of clocked time. See the attachments.
>> 
>> Best,
>> Ihor
>> 
>> 
>> 
>> 
>> Marcin Borkowski  writes:
>> 
>>> On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:
>>>
>>>> So, I keep clock times for work in org mode, this is very
>>>> handy. However, my customers require that I use their service to
>>>> provide the times. They do offer API. So basically I'm using elisp to
>>>> parse org, make API calls, and at the same time generate CSV reports
>>>> with a Python interop with org babel (because my elisp is just too bad
>>>> to do that). If I had access to some org parser, I'd pick a language
>>>> that would be more comfortable for me to get the job done. I guess it
>>>> can all be done in elisp, however this is just a tool for me alone and
>>>> I have limited time resources on hacking things for myself :)
>>>
>>> I was in the exact same situation - I use Org-mode clocking, and we use
>>> Toggl at our company, so I wrote a simple tool to fire API requests to
>>> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
>>> It's a bit more than 200 lines of Elisp, so you might try to look into
>>> it and adapt it to whatever tool your employer is using.
>>>
>>>> Another one is generating total hours report for day/week/month to put
>>>> into my awesomewm toolbar. I ended up using orgstat
>>>> https://github.com/volhovM/orgstat
>>>> however the author is creating his own DSL in YAML and I guess things
>>>> were much better off if it all stayed in some Scheme :)
>>>
>>> Wow, another awesomewm user here; could you share your code?
>>>
>>> Best,
>>>
>>> -- 
>>> Marcin Borkowski
>>> http://mbork.pl
>
>
> I don't have interesting code, just standard awesomewm setup. I run 
> periodic script to output data computed by orgstat and show it in the 
> taskbar (uses the shellout_widget).
>
> However what Ihor presented is interesting. Do you use similar approach 
> with shellout and 'emacs -batch' to show currently running task or you 
> 'push' data from emacs to show it in the taskbar?
>
> P.



Re: official orgmode parser

2020-09-16 Thread Przemysław Kamiński

On 9/16/20 9:56 AM, Ihor Radchenko wrote:

Wow, another awesomewm user here; could you share your code?


Are you interested in something particular about awesome WM integration?

I am using simple textbox widgets to show currently clocked in task and
weighted summary of clocked time. See the attachments.

Best,
Ihor




Marcin Borkowski  writes:


On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:


So, I keep clock times for work in org mode, this is very
handy. However, my customers require that I use their service to
provide the times. They do offer API. So basically I'm using elisp to
parse org, make API calls, and at the same time generate CSV reports
with a Python interop with org babel (because my elisp is just too bad
to do that). If I had access to some org parser, I'd pick a language
that would be more comfortable for me to get the job done. I guess it
can all be done in elisp, however this is just a tool for me alone and
I have limited time resources on hacking things for myself :)


I was in the exact same situation - I use Org-mode clocking, and we use
Toggl at our company, so I wrote a simple tool to fire API requests to
Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
It's a bit more than 200 lines of Elisp, so you might try to look into
it and adapt it to whatever tool your employer is using.


Another one is generating total hours report for day/week/month to put
into my awesomewm toolbar. I ended up using orgstat
https://github.com/volhovM/orgstat
however the author is creating his own DSL in YAML and I guess things
were much better off if it all stayed in some Scheme :)


Wow, another awesomewm user here; could you share your code?

Best,

--
Marcin Borkowski
http://mbork.pl



I don't have interesting code, just standard awesomewm setup. I run 
periodic script to output data computed by orgstat and show it in the 
taskbar (uses the shellout_widget).


However what Ihor presented is interesting. Do you use similar approach 
with shellout and 'emacs -batch' to show currently running task or you 
'push' data from emacs to show it in the taskbar?


P.



Re: official orgmode parser

2020-09-16 Thread Ihor Radchenko
> Wow, another awesomewm user here; could you share your code?

Are you interested in something particular about awesome WM integration?

I am using simple textbox widgets to show currently clocked in task and
weighted summary of clocked time. See the attachments.

Best,
Ihor



Marcin Borkowski  writes:

> On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:
>
>> So, I keep clock times for work in org mode, this is very
>> handy. However, my customers require that I use their service to
>> provide the times. They do offer API. So basically I'm using elisp to
>> parse org, make API calls, and at the same time generate CSV reports
>> with a Python interop with org babel (because my elisp is just too bad
>> to do that). If I had access to some org parser, I'd pick a language
>> that would be more comfortable for me to get the job done. I guess it
>> can all be done in elisp, however this is just a tool for me alone and
>> I have limited time resources on hacking things for myself :)
>
> I was in the exact same situation - I use Org-mode clocking, and we use
> Toggl at our company, so I wrote a simple tool to fire API requests to
> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
> It's a bit more than 200 lines of Elisp, so you might try to look into
> it and adapt it to whatever tool your employer is using.
>
>> Another one is generating total hours report for day/week/month to put
>> into my awesomewm toolbar. I ended up using orgstat
>> https://github.com/volhovM/orgstat
>> however the author is creating his own DSL in YAML and I guess things
>> were much better off if it all stayed in some Scheme :)
>
> Wow, another awesomewm user here; could you share your code?
>
> Best,
>
> -- 
> Marcin Borkowski
> http://mbork.pl


Re: official orgmode parser

2020-09-16 Thread Marcin Borkowski


On 2020-09-15, at 11:17, Przemysław Kamiński  wrote:

> So, I keep clock times for work in org mode, this is very
> handy. However, my customers require that I use their service to
> provide the times. They do offer API. So basically I'm using elisp to
> parse org, make API calls, and at the same time generate CSV reports
> with a Python interop with org babel (because my elisp is just too bad
> to do that). If I had access to some org parser, I'd pick a language
> that would be more comfortable for me to get the job done. I guess it
> can all be done in elisp, however this is just a tool for me alone and
> I have limited time resources on hacking things for myself :)

I was in the exact same situation - I use Org-mode clocking, and we use
Toggl at our company, so I wrote a simple tool to fire API requests to
Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
It's a bit more than 200 lines of Elisp, so you might try to look into
it and adapt it to whatever tool your employer is using.

> Another one is generating total hours report for day/week/month to put
> into my awesomewm toolbar. I ended up using orgstat
> https://github.com/volhovM/orgstat
> however the author is creating his own DSL in YAML and I guess things
> were much better off if it all stayed in some Scheme :)

Wow, another awesomewm user here; could you share your code?

Best,

-- 
Marcin Borkowski
http://mbork.pl



Re: official orgmode parser

2020-09-15 Thread Tim Cross


Przemysław Kamiński  writes:

>
> So, I keep clock times for work in org mode, this is very handy. 
> However, my customers require that I use their service to provide the 
> times. They do offer API. So basically I'm using elisp to parse org, 
> make API calls, and at the same time generate CSV reports with a Python 
> interop with org babel (because my elisp is just too bad to do that). If 
> I had access to some org parser, I'd pick a language that would be more 
> comfortable for me to get the job done. I guess it can all be done in 
> elisp, however this is just a tool for me alone and I have limited time 
> resources on hacking things for myself :)
>

I would probably use org's org-table-export command to export the clock
table as a CSV and then just use a simple script to read in that CSV and
do the API calls. 
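
A minimal Python sketch of the "simple script" side of this suggestion follows. The CSV layout is an assumption: a header row with `Headline` and `Time` columns, durations written as `H:MM`, and an emphasised `*Total time*` summary row. Real clock tables can have more columns and longer duration formats, so treat the column names and the function names as hypothetical. The actual API call is left out because the customer's API is not described in the thread; the function only prepares the records one would send.

```python
import csv
import io


def parse_duration(value):
    """Convert a duration such as '1:30' (or '*1:30*') into minutes."""
    hours, minutes = value.strip().strip("*").split(":")
    return int(hours) * 60 + int(minutes)


def read_clock_csv(text):
    """Return one {"task", "minutes"} record per data row of the CSV text."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        headline = (row.get("Headline") or "").strip()
        duration = (row.get("Time") or "").strip()
        if not headline or not duration or headline.startswith("*"):
            continue  # skip blank rows and the emphasised total row
        records.append({"task": headline, "minutes": parse_duration(duration)})
    return records
```

Each record can then be turned into whatever request body the time-tracking service expects.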

> Another one is generating total hours report for day/week/month to put 
> into my awesomewm toolbar. I ended up using orgstat
> https://github.com/volhovM/orgstat
> however the author is creating his own DSL in YAML and I guess things 
> were much better off if it all stayed in some Scheme :)
>

Sounds like you have a solution. I would probably just setup a hook to
generate the updated table and export it when the file is saved and then
have something consume that exported file to update the taskbar. 

-- 
Tim Cross



Re: official orgmode parser

2020-09-15 Thread Diego Zamboni
There's also org-ql (https://github.com/alphapapa/org-ql), which also
provides a query-based API against Org structures.

--Diego


On Tue, Sep 15, 2020 at 2:59 PM  wrote:

> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
>
> [...]
>
> > There's the org-json (or ox-json) package but for some reason I
> > wasn't able to run it successfully. I guess export to S-exps would
> > be best here. But yes I'll check that out.
>
> If that's your route, perhaps the "Org element API" [1] might be
> helpful. Especially `org-element-parse-buffer' gives you a Lisp
> data structure which is supposed to be a parse of your Org buffer.
>
> From there to S-expression can be trivial (e.g. `print' or `pp'),
> depending on what you want to do.
>
> Walking the structure should be nice in Lisp, too.
>
> The topic of (non-Emacs) parsing of Org comes up regularly, and
> there is a good (but AFAIK not-quite-complete) Org syntax spec
> in Worg [2], but there are a couple of difficulties to be mastered
> before such a thing can become really enjoyable and useful.
>
> The loose specification of Org's format (arguably its second
> or third strongest asset, the first two being its incredible
> community and Emacs itself) is something which makes this
> problem "interesting". People have invented lots of usages
> which might be broken should Org change to a strict formal
> spec. You don't want to break those people.
>
> But yes, perhaps some day someone nails it. Perhaps it's you :)
>
> Cheers
>
> [1] https://orgmode.org/worg/dev/org-element-api.html
> [2] https://orgmode.org/worg/dev/org-syntax.html
>
>  - t
>


Re: official orgmode parser

2020-09-15 Thread tomas
On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:

[...]

> There's the org-json (or ox-json) package but for some reason I
> wasn't able to run it successfully. I guess export to S-exps would
> be best here. But yes I'll check that out.

If that's your route, perhaps the "Org element API" [1] might be
helpful. Especially `org-element-parse-buffer' gives you a Lisp
data structure which is supposed to be a parse of your Org buffer.

From there to S-expression can be trivial (e.g. `print' or `pp'),
depending on what you want to do.

Walking the structure should be nice in Lisp, too.

The topic of (non-Emacs) parsing of Org comes up regularly, and
there is a good (but AFAIK not-quite-complete) Org syntax spec
in Worg [2], but there are a couple of difficulties to be mastered
before such a thing can become really enjoyable and useful.

The loose specification of Org's format (arguably its second
or third strongest asset, the first two being its incredible
community and Emacs itself) is something which makes this
problem "interesting". People have invented lots of usages
which might be broken should Org change to a strict formal
spec. You don't want to break those people.

But yes, perhaps some day someone nails it. Perhaps it's you :)

Cheers

[1] https://orgmode.org/worg/dev/org-element-api.html
[2] https://orgmode.org/worg/dev/org-syntax.html

 - t




Re: official orgmode parser

2020-09-15 Thread Przemysław Kamiński

On 9/15/20 11:55 AM, Russell Adams wrote:

On Tue, Sep 15, 2020 at 11:17:57AM +0200, Przemysław Kamiński wrote:

Org mode IS an elisp application. This is the main goal. The reason it
works so well is because elisp is largely a DSL that focuses on text
manipulation and is therefore ideally suited for a text based organiser.


So, I keep clock times for work in org mode, this is very handy.
However, my customers require that I use their service to provide the
times. They do offer API. So basically I'm using elisp to parse org,
make API calls, and at the same time generate CSV reports with a Python
interop with org babel (because my elisp is just too bad to do
that).


Please consider this is a very specialized use case.


If I had access to some org parser, I'd pick a language that would
be more comfortable for me to get the job done. I guess it can all
be done in elisp, however this is just a tool for me alone and I
have limited time resources on hacking things for myself :)


Maintainer time is limited too. Maintaining a parser library outside
of Emacs would be difficult for the reasons already given. I'd
encourage you to pick up some more Elisp, which I am also trying to
do.


Anyways, my parser needs aren't that sophisticated: just parse the file,
return headings with clock drawers. I tried the common lisp library but
got frustrated after fiddling with it for couple of hours.


If it's that small you could always do that in Python with regexps for
your usage if you're more comfortable in Python. Org's plain text
format means you can read it with anything. I suspect grep might even
pull headlines and clocks successfully.



I haven't looked at the elisp parser much, but I do wonder if someone
couldn't write an exporter that exports a programmatic version of your
org file data (ie: to xml). Then other tools could ingest those xml
files. That'd certainly be a contrib module and not in the core, but
might be worth your while to explore the idea if you really want to
work with Org data outside of Emacs.


--
Russell Adams    rlad...@adamsinfoserv.com

PGP Key ID: 0x1160DCB3   http://www.adamsinfoserv.com/

Fingerprint: 1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3



There's the org-json (or ox-json) package but for some reason I wasn't 
able to run it successfully. I guess export to S-exps would be best 
here. But yes I'll check that out.


Przemek



Re: official orgmode parser

2020-09-15 Thread Russell Adams
On Tue, Sep 15, 2020 at 11:17:57AM +0200, Przemysław Kamiński wrote:
> > Org mode IS an elisp application. This is the main goal. The reason it
> > works so well is because elisp is largely a DSL that focuses on text
> > manipulation and is therefore ideally suited for a text based organiser.
>
> So, I keep clock times for work in org mode, this is very handy.
> However, my customers require that I use their service to provide the
> times. They do offer API. So basically I'm using elisp to parse org,
> make API calls, and at the same time generate CSV reports with a Python
> interop with org babel (because my elisp is just too bad to do
> that).

Please consider this is a very specialized use case.

> If I had access to some org parser, I'd pick a language that would
> be more comfortable for me to get the job done. I guess it can all
> be done in elisp, however this is just a tool for me alone and I
> have limited time resources on hacking things for myself :)

Maintainer time is limited too. Maintaining a parser library outside
of Emacs would be difficult for the reasons already given. I'd
encourage you to pick up some more Elisp, which I am also trying to
do.

> Anyways, my parser needs aren't that sophisticated: just parse the file,
> return headings with clock drawers. I tried the common lisp library but
> got frustrated after fiddling with it for couple of hours.

If it's that small you could always do that in Python with regexps for
your usage if you're more comfortable in Python. Org's plain text
format means you can read it with anything. I suspect grep might even
pull headlines and clocks successfully.



I haven't looked at the elisp parser much, but I do wonder if someone
couldn't write an exporter that exports a programmatic version of your
org file data (ie: to xml). Then other tools could ingest those xml
files. That'd certainly be a contrib module and not in the core, but
might be worth your while to explore the idea if you really want to
work with Org data outside of Emacs.


--
Russell Adams    rlad...@adamsinfoserv.com

PGP Key ID: 0x1160DCB3   http://www.adamsinfoserv.com/

Fingerprint: 1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3



Re: official orgmode parser

2020-09-15 Thread Przemysław Kamiński

On 9/15/20 11:03 AM, Tim Cross wrote:


> Przemysław Kamiński writes:
>
>> Hello,
>>
>> I oftentimes find myself needing to parse org files with some external
>> tools (to generate reports for customers or sum up clock times for given
>> month, etc). Looking through the list
>>
>> https://orgmode.org/worg/org-tools/
>>
>> and having tested some of these, I must say they are lacking. The
>> Haskell ones seem to be done best, but then the compile overhead of
>> Haskell and difficulty in embedding this into other languages is a drawback.
>>
>> I think it might benefit the community if such an official parser
>> existed (and maybe could be hooked into org mode directly).
>>
>> I was thinking of picking some Scheme like Chicken or Guile, which could be
>> later easily embedded into C or whatever. Then use that parser in org
>> mode itself. This way some important part of org mode would be outside
>> of the small world of elisp.
>>
>> This is just an idea, what do you think? :)
>
> The problem with this idea is maintenance. It is also partly why
> external tools are not terribly reliable/good. Org mode is constantly
> being enhanced and improved. It is very hard for external tools to keep
> pace with org-mode development, so they soon get out of date or stop
> working correctly.
>
> Org mode IS an elisp application. This is the main goal. The reason it
> works so well is because elisp is largely a DSL that focuses on text
> manipulation and is therefore ideally suited for a text-based organiser.
>
> This means if you want to implement parsing of org files in any
> other language, there is a lot of fundamental functionality which will
> need to be implemented that is not necessary when using elisp as it is
> already built-in. Not only that, it is also 'battle hardened' and well
> tested. The other problem would be in selecting another language which
> behaves consistently across all the platforms Emacs and org-mode are
> supported on. As org-mode is a standard part of Emacs, it also needs to
> be implemented in something which is also available on all the platforms
> Emacs is on without needing the user to install additional software.
>
> The other issue is that you would need another skill in order to
> maintain/extend org-mode. In addition to elisp, you will also need to
> know whatever the parser implementation language is.
>
> A third negative is that if the parser was in a different language to
> elisp, the interface between the rest of org mode (in elisp) and the
> parser would become an issue. At the moment, there are far fewer
> barriers as it is all elisp. However, if part of the system is in
> another language, you are now restricted to whatever defined interface
> exists. This would likely also have performance issues and overheads
> associated with translating from one format to another etc.
>
> So, in short, the chances of org mode using a parser written in
> something other than elisp are pretty close to 0. This leaves you with 2
> options -
>
> 1. Implement another external tool which can parse org-files. As
> mentioned above, this is a non-trivial task and will likely be difficult
> to maintain. Probably not the best first choice.
>
> 2. Provide some details about your workflow where you believe you need
> to use external tools to process the org-files. It is very likely there
> are alternative approaches to give you the result you want, but without
> the need to do external parsing of org-files. There isn't sufficient
> detail in the examples you mention to provide any specific advice.
> However, I have used org-mode for reporting, invoicing, time tracking,
> documentation, issue/request tracking, project planning and project
> management and never needed to parse my org files with an external tool.
> I have exported the data in different formats which have then been
> processed by other tools and I have tweaked my setup to support various
> enterprise/corporate standards or requirements (logos, corporate
> colours, report formats, etc). Sometimes these tweaks are trivial and
> others require more extensive effort. Often, others have had to do
> something the same or similar and have working examples etc.
>
> So my recommendation is post some messages to this list with details on
> what you need to try and do and see what others can suggest. I would
> keep each post to a single item rather than one long post with multiple
> requests. From watching this list, I've often seen someone post a "How
> can I ..." question only to get the answer "Oh, that is already
> built-in, just do .". Org is a large application with lots of
> sophisticated power that isn't always obvious from just reading the
> manual.




So, I keep clock times for work in org mode, which is very handy. 
However, my customers require that I use their service to report the 
times. They do offer an API. So basically I'm using elisp to parse org, 
make API calls, and at the same time generate CSV reports with a Python 
interop with org babel (because my elisp is just too bad to do that). If 
I had access to some org parser, I'd pick a language that would be more 
comfortable for me to get the job 

Re: official orgmode parser

2020-09-15 Thread Tim Cross


Przemysław Kamiński  writes:

> Hello,
>
> I oftentimes find myself needing to parse org files with some external 
> tools (to generate reports for customers or sum up clock times for given 
> month, etc). Looking through the list
>
> https://orgmode.org/worg/org-tools/
>
> and having tested some of these, I must say they are lacking. The 
> Haskell ones seem to be done best, but then the compile overhead of 
> Haskell and difficulty in embedding this into other languages is a drawback.
>
> I think it might benefit the community if such an official parser 
> existed (and maybe could be hooked into org mode directly).
>
> I was thinking of picking some Scheme like Chicken or Guile, which could be 
> later easily embedded into C or whatever. Then use that parser in org 
> mode itself. This way some important part of org mode would be outside 
> of the small world of elisp.
>
> This is just an idea, what do you think? :)
>

The problem with this idea is maintenance. It is also partly why
external tools are not terribly reliable/good. Org mode is constantly
being enhanced and improved. It is very hard for external tools to keep
pace with org-mode development, so they soon get out of date or stop
working correctly. 

Org mode IS an elisp application. This is the main goal. The reason it
works so well is because elisp is largely a DSL that focuses on text
manipulation and is therefore ideally suited for a text-based organiser. 

This means if you want to implement parsing of org files in any
other language, there is a lot of fundamental functionality which will
need to be implemented that is not necessary when using elisp as it is
already built-in. Not only that, it is also 'battle hardened' and well
tested. The other problem would be in selecting another language which
behaves consistently across all the platforms Emacs and org-mode are
supported on. As org-mode is a standard part of Emacs, it also needs to
be implemented in something which is also available on all the platforms
Emacs is on without needing the user to install additional software. 

The other issue is that you would need another skill in order to
maintain/extend org-mode. In addition to elisp, you will also need to
know whatever the parser implementation language is.

A third negative is that if the parser was in a different language to
elisp, the interface between the rest of org mode (in elisp) and the
parser would become an issue. At the moment, there are far fewer
barriers as it is all elisp. However, if part of the system is in
another language, you are now restricted to whatever defined interface
exists. This would likely also have performance issues and overheads
associated with translating from one format to another etc.

So, in short, the chances of org mode using a parser written in
something other than elisp are pretty close to 0. This leaves you with 2
options -

1. Implement another external tool which can parse org-files. As
mentioned above, this is a non-trivial task and will likely be difficult
to maintain. Probably not the best first choice.

2. Provide some details about your workflow where you believe you need
to use external tools to process the org-files. It is very likely there
are alternative approaches to give you the result you want, but without
the need to do external parsing of org-files. There isn't sufficient
detail in the examples you mention to provide any specific advice.
However, I have used org-mode for reporting, invoicing, time tracking,
documentation, issue/request tracking, project planning and project
management and never needed to parse my org files with an external tool.
I have exported the data in different formats which have then been
processed by other tools and I have tweaked my setup to support various
enterprise/corporate standards or requirements (logos, corporate
colours, report formats, etc). Sometimes these tweaks are trivial and
others require more extensive effort. Often, others have had to do
something the same or similar and have working examples etc.

So my recommendation is post some messages to this list with details on
what you need to try and do and see what others can suggest. I would
keep each post to a single item rather than one long post with multiple
requests. From watching this list, I've often seen someone post a "How
can I ..." question only to get the answer "Oh, that is already
built-in, just do .". Org is a large application with lots of
sophisticated power that isn't always obvious from just reading the
manual. 




Re: official orgmode parser

2020-09-15 Thread Gerry Agbobada
Hi,

I'm currently toying with the idea of trying a tree-sitter parser for Org. The 
very static nature of a shared-object parser (given that TODO keywords, for 
example, are pretty dynamic) is a challenge I'm not sure I can overcome; to be 
honest, even without that I can't say I'll manage to do it.

Having a tree-sitter parser would be really great in my opinion; at least it's 
a clearer way to "freeze" the syntax, with some tests describing the syntax tree 
as S-expressions. And tree-sitter seems to be the popular, sought-after 
solution to slowness in parsing (and incremental parsing of org files would 
help with big files, in my opinion).
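To make the TODO-keyword problem concrete, here is a small Python sketch (not tree-sitter, and not its test format) that parses a single heading line and prints the result as an S-expression. The function names parse_heading and to_sexp, the default keyword set, and the S-expression layout are all assumptions made up for this illustration. The point is only that the same line yields a different tree depending on which keywords are configured, which is exactly what a parser with a fixed grammar cannot know in advance.

```python
import re


def parse_heading(line, todo_keywords=("TODO", "DONE")):
    """Parse one Org heading line; return a dict, or None if not a heading.

    todo_keywords is passed in because Org lets each file or user
    configure its own set.
    """
    m = re.match(r"^(\*+) +(.*)$", line)
    if m is None:
        return None
    level = len(m.group(1))
    rest = m.group(2).strip()
    keyword = None
    # The first word is a TODO keyword only if it is in the configured set.
    first, _, remainder = rest.partition(" ")
    if first in todo_keywords:
        keyword = first
        rest = remainder.strip()
    return {"level": level, "todo": keyword, "title": rest}


def to_sexp(heading):
    """Render a parsed heading as a (hypothetical) S-expression string."""
    level = heading["level"]
    todo = heading["todo"]
    todo_part = '"' + todo + '"' if todo else "nil"
    # Escape backslashes and double quotes so the title stays one string.
    title = heading["title"].replace("\\", "\\\\").replace('"', '\\"')
    return f'(heading (level {level}) (todo {todo_part}) (title "{title}"))'


if __name__ == "__main__":
    line = "** WAITING Call Bob"
    # With the default keywords, WAITING is just part of the title.
    print(to_sexp(parse_heading(line)))
    # With a custom keyword set, the same line parses differently.
    print(to_sexp(parse_heading(line, ("TODO", "WAITING", "DONE"))))
```

Fixture files pairing Org input with expected trees in this spirit would also serve as the kind of syntax "freeze" mentioned above, whatever concrete tree format ends up being used.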

On Tue, Sep 15, 2020, at 09:58, Przemysław Kamiński wrote:
> Hello,
> 
> I oftentimes find myself needing to parse org files with some external 
> tools (to generate reports for customers or sum up clock times for given 
> month, etc). Looking through the list
> 
> https://orgmode.org/worg/org-tools/
> 
> and having tested some of these, I must say they are lacking. The 
> Haskell ones seem to be done best, but then the compile overhead of 
> Haskell and difficulty in embedding this into other languages is a drawback.
> 
> I think it might benefit the community if such an official parser 
> existed (and maybe could be hooked into org mode directly).
> 
> I was thinking of picking some Scheme like Chicken or Guile, which could be 
> later easily embedded into C or whatever. Then use that parser in org 
> mode itself. This way some important part of org mode would be outside 
> of the small world of elisp.
> 
> This is just an idea, what do you think? :)
> 
> Best,
> Przemek
> 
> 

Gerry Agbobada


official orgmode parser

2020-09-15 Thread Przemysław Kamiński

Hello,

I oftentimes find myself needing to parse org files with some external 
tools (to generate reports for customers or sum up clock times for given 
month, etc). Looking through the list


https://orgmode.org/worg/org-tools/

and having tested some of these, I must say they are lacking. The 
Haskell ones seem to be done best, but then the compile overhead of 
Haskell and difficulty in embedding this into other languages is a drawback.


I think it might benefit the community if such an official parser 
existed (and maybe could be hooked into org mode directly).


I was thinking of picking some Scheme like Chicken or Guile, which could be 
later easily embedded into C or whatever. Then use that parser in org 
mode itself. This way some important part of org mode would be outside 
of the small world of elisp.


This is just an idea, what do you think? :)

Best,
Przemek
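
For the use case described in this message (summing clock times for a given month and producing a report), a minimal Python sketch without any Org parser might look like the following. It assumes closed clock lines of the form "CLOCK: [start]--[end]" with timestamps written as YYYY-MM-DD, an optional day name, and HH:MM. The names minutes_per_month, to_csv and SAMPLE are invented for the example. Durations are computed from the two timestamps rather than from the "=>" summary, and each clock entry is counted in the month in which it starts.

```python
import csv
import io
import re
from collections import defaultdict
from datetime import datetime

# One Org timestamp: date, optional day name (ignored), time of day.
TS = r"\[(\d{4}-\d{2}-\d{2})(?: [^\]\s]+)? (\d{1,2}:\d{2})\]"
CLOCK_RE = re.compile(r"^\s*CLOCK:\s*" + TS + "--" + TS)


def _parse(date, time):
    """Combine the date and time parts of a timestamp into a datetime."""
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")


def minutes_per_month(text):
    """Sum clocked minutes per 'YYYY-MM', keyed by the start timestamp."""
    totals = defaultdict(int)
    for line in text.splitlines():
        m = CLOCK_RE.match(line)
        if not m:
            continue  # not a closed clock line
        start = _parse(m.group(1), m.group(2))
        end = _parse(m.group(3), m.group(4))
        totals[start.strftime("%Y-%m")] += int((end - start).total_seconds() // 60)
    return dict(totals)


def to_csv(totals):
    """Render the monthly totals as CSV text with hours to two decimals."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["month", "hours"])
    for month in sorted(totals):
        writer.writerow([month, f"{totals[month] / 60:.2f}"])
    return buf.getvalue()


SAMPLE = """\
* Customer A
  CLOCK: [2020-08-31 Mon 16:00]--[2020-08-31 Mon 17:15] =>  1:15
  CLOCK: [2020-09-14 Mon 09:00]--[2020-09-14 Mon 10:30] =>  1:30
** Subtask
  CLOCK: [2020-09-15 Tue 13:00]--[2020-09-15 Tue 13:45] =>  0:45
"""

if __name__ == "__main__":
    print(to_csv(minutes_per_month(SAMPLE)), end="")
```

This ignores running (unclosed) clocks and anything else that depends on Org's own semantics, which is where the maintenance concerns raised in the replies come in.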