Re: Implementation options for auto-complete and symbol resolution while coding

2013-12-23 Thread Colin Fleming
As another data point, Cursive's symbol resolution in normal code editing
is totally static - it doesn't use the REPL at all. When typing in the REPL
window, local resolution is used for the code in the editor and the REPL is
used for everything else, so that local symbols can be completed and so
forth but everything else comes from the running system.

I'm lucky to be building on the IntelliJ infrastructure, which provides a
fantastic architecture for static resolution out of the box. It's fully
asynchronous already and is the basic building block for all IDE
functionality. It also provides automatic indexing for various types of
indices.

Static resolution is definitely a trade-off in the presence of macros,
since in the general case it's undecidable what a macro does without
actually executing it. This is by far the biggest downside of static
resolution: until the editor is told which symbols a particular macro form
declares, it cannot resolve them. For those symbols essentially nothing
works: navigation, symbol rename, find usages, refactorings, etc. I'm
working on an API (which I already use internally) which will hopefully make
adding support for new libraries relatively trivial. I plan to release that
API when it's more stable so anyone can add support for public libraries or
for their internal libs.
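
To make the "tell the editor which symbols a macro form declares" idea
concrete, here is a minimal sketch in Python. It is not Cursive's API (which
is not shown in this thread); the registry, the decorator and the `defpair`
macro are all hypothetical names invented for illustration. Forms are assumed
to be already parsed into nested lists of strings, and nothing is evaluated.

```python
from typing import Callable, Dict, List

# Hypothetical registry: macro name -> function that lists the symbols
# a form headed by that macro declares. Nothing is ever executed.
DECLARERS: Dict[str, Callable[[list], List[str]]] = {}


def declares(macro_name: str):
    """Register a declarer for forms whose head is `macro_name`."""
    def register(fn):
        DECLARERS[macro_name] = fn
        return fn
    return register


@declares("def")
@declares("defn")
def second_element(form: list) -> List[str]:
    # (defn foo [x] ...) declares the symbol in second position.
    return [form[1]]


@declares("defpair")  # hypothetical library macro: (defpair a b ...)
def two_symbols(form: list) -> List[str]:
    return [form[1], form[2]]


def declared_symbols(forms: List[list]) -> List[str]:
    """Collect declared symbols from top-level forms, in source order."""
    found: List[str] = []
    for form in forms:
        if isinstance(form, list) and form and isinstance(form[0], str):
            declarer = DECLARERS.get(form[0])
            if declarer is not None:
                found.extend(declarer(form))
            # Forms with an unregistered head declare nothing we can see,
            # which is exactly the limitation described above.
    return found
```

With this shape, supporting a new library means registering one small
function per macro rather than changing the resolver itself.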

The advantages are pretty huge though. Cross-project resolution,
completion, and navigation are all possible whether you have the REPL
running or not. Local bindings are treated identically to global vars.
"Find usages" type functionality is possible, which apart from being an
essential navigation tool in large projects (or indeed in any project) is
also the basis of many refactorings. The indices backing all this are
transparently updated continuously in the background and the editor is very
responsive, even when editing clojure/core.clj.
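
As a rough illustration of an index that backs "find usages" and can be
refreshed one file at a time, here is a small Python sketch. It is an
assumption-laden toy, not how IntelliJ's indices work: it records occurrences
by name only, whereas a real implementation would key on resolved
definitions. The class name and the tokenizing regex are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Rough token pattern: any run of characters that is not whitespace or a
# Clojure delimiter. An assumption for this sketch, not a real reader.
TOKEN = re.compile(r'[^\s()\[\]{}",;]+')


class UsageIndex:
    """Name-based occurrence index, stored per file."""

    def __init__(self) -> None:
        # file path -> {symbol name: [line numbers]}
        self._by_file: Dict[str, Dict[str, List[int]]] = {}

    def update_file(self, path: str, text: str) -> None:
        """Re-index one file; entries for other files are untouched."""
        per_symbol: Dict[str, List[int]] = {}
        for line_no, line in enumerate(text.splitlines(), start=1):
            for token in TOKEN.findall(line):
                per_symbol.setdefault(token, []).append(line_no)
        # Replacing the whole entry drops stale occurrences for this file.
        self._by_file[path] = per_symbol

    def find_usages(self, symbol: str) -> List[Tuple[str, int]]:
        """Return (path, line) pairs for every occurrence of `symbol`."""
        return sorted(
            (path, line_no)
            for path, symbols in self._by_file.items()
            for line_no in symbols.get(symbol, [])
        )
```

Because each file owns its own slice of the index, an edit only costs a
re-scan of the file that changed.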

Only time will tell if this approach is better or worse than REPL-based
resolution, but for the moment I'm very happy with the tradeoff. The
infrastructure is very complicated to implement, though. It's more stable
now, but for a long time it took most of my development time, and I'm still
planning a few more fairly major changes as I encounter more crazy things
people do with macros.

Cheers,
Colin



Re: Implementation options for auto-complete and symbol resolution while coding

2013-12-23 Thread Juan Martín
*John*: I had watched that talk a while ago, not sure how I got to it. The
work he describes is really interesting and sounds quite herculean,
something that a company like Google can do. Unfortunately the project
hasn't seen the public light of day yet, at least not that I know of.

*Zach*: Thanks for sharing the library Nightcode uses for auto-completion
and docstrings, I will certainly take a look at it.

*Tim*: This approach certainly doesn't sound too costly. I had thought of a
similar one but didn't get to implement it yet; the fact that you suggested
it, and that Laurent's response mentions multiple sources as well, makes me
think that this is the way to go.

*Laurent*: I've used Counterclockwise and I think it's one of the
strongest development tools for Clojure out there. Decoupling the
producer(s) from the consumer(s) of the symbol dictionary seems like a good
idea. Large files are certainly an issue when dealing with this subject,
since the whole file needs to be parsed and processed, even while it is
being edited.

I think I have enough for a little more hammock time and an implementation
attempt. My partial conclusion so far (which may not be spot on, so feel
free to contribute) is that since this is a really hard problem, it is
reasonable to expect sub-optimal (but usable in almost all scenarios)
symbol resolution results from any tool. To me that means there is
always room for improvement, which actually makes the problem interesting
:)

Thanks to all those who replied, all the comments and thoughts you shared
have been really useful!

Juan



Re: Implementation options for auto-complete and symbol resolution while coding

2013-12-23 Thread Laurent PETIT
Hello,

Food for thought:


Currently Counterclockwise does 2 things:

- it has an up-to-date list of symbols / keywords derived from the current
editor. This of course does not need a running REPL and works as a heuristic
for locals, but that's all. It won't go beyond the current file and won't
show the docstring or arglist of vars.
- it tries to use the last active REPL View, if there's one, for either
code completion or symbol resolution+metadata (for showing docstrings and
hyperlinks to other parts of the source code).

Both these approaches rely on synchronous calls:
- in the first case, it asks for the parse tree synchronously. Since
Counterclockwise uses Parsley, which is an incremental parser, it works
well 99% of the time. But there's still this 1% where you work with a big
file, e.g. clojure/core.clj, and you can feel the editor lag behind you.
- in the second case, any lag/problem with the network layer can affect
your typing experience. This was reported to me in the scariest way by
a user last week: a corner case where the out-of-the-box nREPL
client will just hang forever because the remote connection was lost.
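
One way to keep a slow or dead connection from freezing the caller is to put
a deadline on the lookup. The following Python sketch shows the idea under
stated assumptions: `lookup_with_timeout` and `fake_slow_lookup` are
hypothetical names, and the slow function merely stands in for a remote REPL
request. It is not the nREPL client API.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Small pool of worker threads; only these may block on the network.
_pool = ThreadPoolExecutor(max_workers=2)


def lookup_with_timeout(lookup, symbol, timeout_s=0.2, default=None):
    """Run `lookup(symbol)` on a worker and wait at most `timeout_s`."""
    future = _pool.submit(lookup, symbol)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker may still be stuck, but the caller moves on.
        return default


def fake_slow_lookup(symbol):
    """Stand-in for a remote request that answers too late."""
    time.sleep(0.3)
    return "late answer for " + symbol
```

Note that the timeout only protects the caller: the worker thread stays
blocked until the underlying call returns, so a real client would also need
a way to abandon or reset the connection.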

So I'm thinking more and more these days about another design: totally
decoupling the gathering of the "symbols dictionary" from the usage of this
dictionary. A true temporal decoupling.
This means that the editor will never feel sluggish again. Maybe the
information presented will be a little bit out of date in a small
percentage of cases, but that would generally be for the greater good.

My idea so far is to:

- have an atom on the Editor side containing a symbols dictionary. Updates
to this dictionary will be done by background threads based on various
events: manual text changes in the editor (static analysis), user
interaction with a REPL (dynamic gathering of namespaces+vars), and updates
of the project classpath (static analysis of jar dependencies).
- This will allow Counterclockwise to have an always responsive editor.
Only background threads may be blocked by problematic parses, problematic
nREPL connections, etc.
- This temporal decoupling also neatly decouples the production from the
consumption of the "symbols dictionary". This will be an overall better
design to enable additional contributions to the "symbols dictionary"
without direct impact on the consumers.
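
The design above can be sketched roughly as follows. This is a Python
illustration rather than Clojure, and the `Atom` class is only a lock-based
stand-in for a real Clojure atom (which retries with compare-and-set). The
source names "editor" and "repl" and all function names are assumptions made
for the example.

```python
import threading


class Atom:
    """Tiny stand-in for a Clojure atom holding an immutable value."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def deref(self):
        # Readers take no lock: they just see the latest snapshot.
        return self._value

    def swap(self, fn, *args):
        # Writers are serialized and replace the value as a whole.
        with self._lock:
            self._value = fn(self._value, *args)
            return self._value


# The shared dictionary: source name -> frozenset of symbol names.
symbols = Atom({})


def contribute(source, names):
    """Called from a background thread when a producer has fresh data."""
    symbols.swap(lambda current: {**current, source: frozenset(names)})


def complete(prefix):
    """Called from the editor: reads a snapshot and never blocks."""
    snapshot = symbols.deref()
    return sorted(
        {name for names in snapshot.values() for name in names
         if name.startswith(prefix)}
    )


# Two producers running in the background, as in the list above.
producers = [
    threading.Thread(target=contribute, args=("editor", ["my-local", "my-fn"])),
    threading.Thread(target=contribute, args=("repl", ["map", "mapv"])),
]
for thread in producers:
    thread.start()
for thread in producers:
    thread.join()
```

A consumer only ever sees complete snapshots, possibly slightly stale, and a
new producer can be added without touching `complete`.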


So this is going a little bit against the grain of what people are doing
currently by overloading the server-side of things with knowledge, but I
think it's the right direction to go, and the one I'll experiment with in
the next few weeks.


Cheers,

-- 
Laurent





Re: Implementation options for auto-complete and symbol resolution while coding

2013-12-23 Thread Robert Ewald
"juan.facorro"  writes:

> Hi Clojurers,
>
[snip]
>
> There are parsing libraries which provide good parse trees (e.g. Parsley,
> Instaparse), but my understanding is that what needs to be maintained is a
> full abstract syntax tree for the whole code base and although
> clojure.tools.analyzer [4] does the job of creating an AST, generating and
> maintaining all these trees sounds very costly and not the right way to do
> it.

Just an idea. Maybe you should just use the parse tree for locals and
the REPL for globals. That shouldn't be too costly.
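
A minimal sketch of that combination, in Python for illustration. The data
shapes are assumptions: local scopes are taken to be dictionaries produced
from the parse tree (outermost first), and `repl_lookup` is any callable
standing in for a query to the running REPL.

```python
def resolve(symbol, local_scopes, repl_lookup):
    """Resolve `symbol` against locals first, then fall back to the REPL.

    local_scopes: list of dicts (symbol -> binding info), outermost first,
                  as a parse tree walk might produce them (assumed shape).
    repl_lookup:  callable returning info for a global, or None.
    """
    # Innermost scope wins, so walk the scopes from the inside out.
    for scope in reversed(local_scopes):
        if symbol in scope:
            return ("local", scope[symbol])
    info = repl_lookup(symbol)
    if info is not None:
        return ("global", info)
    return None


# Example data: line numbers where each local is bound, and a fake REPL.
scopes = [{"x": 3}, {"x": 7, "y": 8}]
fake_repl = {"map": "clojure.core/map"}.get
```

Locals never reach the REPL, so the only remote calls are for symbols the
parse tree cannot explain.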

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Implementation options for auto-complete and symbol resolution while coding

2013-12-20 Thread Zach Oakes
Nightcode uses Compliment for providing completion suggestions and 
documentation of Clojure functions:

https://github.com/alexander-yakushev/compliment




Re: Implementation options for auto-complete and symbol resolution while coding

2013-12-18 Thread John Wiseman
Just for background, Steve Yegge's grok project seems relevant. It is a
cross-language static analysis system intended to be usable on a large
scale. (And is intended to be open sourced, when it's done.)
http://www.youtube.com/watch?v=KTJs-0EInW8





Implementation options for auto-complete and symbol resolution while coding

2013-12-18 Thread juan.facorro
Hi Clojurers,

I'm building a tool for Clojure and I've been hitting the same bump for
quite some time now, namely auto-completion and finding the definition of a
symbol. After doing some research I've found that some tools rely on a
running REPL to figure out where a symbol might be coming from; these
include Emacs [1], Counterclockwise, clooj and maybe others I don't know
about (like Nightcode or Cursive). This seems the natural thing to do since
while developing we always have a REPL running to try out what we code;
after all, this is one of the best Lisp features. This approach results in
very accurate locations for global symbol definitions, but locals are not
found since they are not accessible from the REPL.

Another approach I've seen used for auto-completion in Clojure is the
token-based one, which involves looking for tokens in the code base
associated with the current project and then providing the nearest match
regardless of context; tools that do this include J Editor [2], Light Table
(which I think uses inter-buffer token matching [3]) and Emacs when it uses
dictionary files (maybe not specifically in existing Clojure modes but it's
something that Emacs can do). Although this approach handles
auto-completion, it is not very accurate when locating symbol definitions.
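
For reference, token-based completion can be as small as the following
Python sketch. It is a generic illustration, not the implementation used by
any of the tools named above; the token pattern and the "shortest match
first" ranking are assumptions chosen for the example.

```python
import re

# Rough token pattern: runs of characters that are not whitespace or
# Clojure delimiters. An assumption for this sketch.
TOKEN = re.compile(r'[^\s()\[\]{}",;]+')


def collect_tokens(buffers):
    """Gather every distinct token from a list of buffer texts."""
    tokens = set()
    for text in buffers:
        tokens.update(TOKEN.findall(text))
    return tokens


def complete(prefix, tokens, limit=10):
    """Return tokens starting with `prefix`, shortest (nearest) first."""
    matches = [t for t in tokens if t.startswith(prefix) and t != prefix]
    return sorted(matches, key=lambda t: (len(t), t))[:limit]


buffers = [
    "(defn render-page [req] (render req))",
    "(def renderer nil)",
]
tokens = collect_tokens(buffers)
```

Nothing here knows where `render` is defined or whether it is in scope,
which is why this approach completes well but cannot locate definitions.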

From what I've read this is not a trivial problem, so I was wondering if
there's some implementation that actually resolves symbols statically (I
mean without having a running REPL) in an accurate way or, if there's no
implementation, maybe someone could point me in the right direction (or any
direction) as to what would ease the pain to accomplish such a task.
Building something on my own to do this "static symbol resolution" is out
of the question, since that sounds like a whole project on its own and I'm
currently trying to build something else entirely.

There are parsing libraries which provide good parse trees (e.g. Parsley,
Instaparse), but my understanding is that what needs to be maintained is a
full abstract syntax tree for the whole code base, and although
clojure.tools.analyzer [4] does the job of creating an AST, generating and
maintaining all these trees sounds very costly and not the right way to do
it.

If the running REPL approach is the saner one, then I would have no problem 
with going down that road, but I just wanted to make sure what the viable 
options were.

If you got this far, thank you for your time. :)

Any help, thoughts or comments will be greatly appreciated!

Juan

[1] https://github.com/clojure-emacs/ac-nrepl
[2] http://armedbear-j.sourceforge.net/
[3] https://groups.google.com/forum/#!msg/light-table-discussion/Q-ZvOJSr1qo/-D6tAV_XiMUJ
[4] https://github.com/clojure/tools.analyzer
