LSPRAG — Experiment Reproduction

🛠️ Setup Guide

1. Install LSPRAG from Source

Clone the LSPRAG repository, then install the dependencies and compile the extension from the repository root:

cd LSPRAG
# Install dependencies
npm install

# Build the extension
npm run compile

After installing the extension, configure your language servers and LLM settings as described in the README.

Known Issues

node_modules/lru-cache/dist/commonjs/index.d.ts:1032:5 - error TS2416: Property 'forEach' in type 'LRUCache<K, V, FC>' is not assignable to the same property in base type 'Map<K, V>'.

If compilation fails with the TS2416 error above, downgrade lru-cache:

npm install lru-cache@10.1.0
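
To confirm which lru-cache version ended up in node_modules after the downgrade, a small check like the following may help. It is only a sketch: it assumes the package sits at node_modules/lru-cache under the repository root (as in the error path above), and the helper name installed_version is hypothetical, not part of LSPRAG.

```python
import json
from pathlib import Path
from typing import Optional


def installed_version(repo_root: str, package: str = "lru-cache") -> Optional[str]:
    """Return the version recorded in node_modules/<package>/package.json, or None."""
    manifest = Path(repo_root) / "node_modules" / package / "package.json"
    if not manifest.is_file():
        return None  # package not installed, or npm install has not been run yet
    with manifest.open(encoding="utf-8") as fh:
        return json.load(fh).get("version")


if __name__ == "__main__":
    # Assumes the script is run from the LSPRAG repository root.
    version = installed_version(".")
    print(f"lru-cache version: {version or 'not installed'}")
```

If the printed version is not 10.1.0, rerun the npm install command above and compile again.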

Generate Unit Tests with LSPRAG

Language Server Installation

Install the language server extension for each language you plan to test (the Docker image includes these by default):
- Java: Oracle Java Extension Pack (oracle.oracle-java)
- Python: Pylance + Python (ms-python.vscode-pylance, ms-python.python)
- Go: Go extension (golang.go)
Language-specific setup examples:
Go: enable semantic tokens in settings.json:
```
{
  "gopls": {
    "ui.semanticTokens": true
  }
}
```
Java: if the Oracle Java language server reports errors when running on a Java version older than 17, install Java 17 and set jdk.jdkhome to its installation directory, for example /usr/lib/jvm/java-17-openjdk-amd64 on Linux.
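
Before changing jdk.jdkhome, it can be useful to check which Java version is currently on the PATH. The sketch below parses the output of java -version; the function names parse_java_major and java_major are hypothetical helpers, and the threshold of 17 comes from the note above.

```python
import re
import subprocess
from typing import Optional

# Matches both the modern scheme ("17.0.9", "21") and the legacy one ("1.8.0_292").
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def java_major(java_bin: str = "java") -> Optional[int]:
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            [java_bin, "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    # `java -version` writes to stderr; stdout is included as a fallback.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = java_major()
    if major is None:
        print("Could not determine the Java version")
    elif major < 17:
        print(f"Java {major} found; install Java 17 and set jdk.jdkhome")
    else:
        print(f"Java {major} is new enough")
```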

[Figure: selecting the Python interpreter in VS Code]

[Figure: example of an interpreter path]

If no symbols are found for a source file, check that the language server extension for that language is installed and active.

Option A: Download IDE Plugin

Install the LSPRAG extension from the VS Code Marketplace.
Then configure the LLM provider in settings.json:

{
  "LSPRAG": {
    "provider": "deepseek",    // or: openai, ollama
    "model": "deepseek-chat",  // e.g., gpt-4o-mini, llama3-70b
    "openaiApiKey": "your-api-key",
    "deepseekApiKey": "your-api-key",
    "localLLMUrl": "http://your-ollama-server:port",
    "proxyUrl": "your-proxy-server"
  }
}
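
A quick sanity check of these settings can catch a missing key before a generation run. The sketch below takes the LSPRAG block as a plain dictionary (settings.json may contain comments, so the standard json module may not parse the file directly). The mapping from provider to required key is an assumption inferred from the key names in the example, not taken from the LSPRAG source, and check_settings is a hypothetical helper.

```python
from typing import Dict, List

# Assumed mapping from provider to the setting it needs; inferred from the
# key names in the example settings, not from the LSPRAG implementation.
REQUIRED_KEY = {
    "deepseek": "deepseekApiKey",
    "openai": "openaiApiKey",
    "ollama": "localLLMUrl",
}


def check_settings(lsprag: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems: List[str] = []
    provider = lsprag.get("provider", "")
    if provider not in REQUIRED_KEY:
        problems.append(f"unknown provider: {provider!r}")
        return problems
    if not lsprag.get("model"):
        problems.append("model is not set")
    key = REQUIRED_KEY[provider]
    value = lsprag.get(key, "")
    # The example uses placeholders such as "your-api-key"; flag them too.
    if not value or "your-" in value:
        problems.append(f"{key} is missing or still a placeholder")
    return problems


if __name__ == "__main__":
    example = {
        "provider": "deepseek",
        "model": "deepseek-chat",
        "deepseekApiKey": "your-api-key",
    }
    for problem in check_settings(example):
        print(problem)
```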

Open the target project.
[Optional] Compile the project to improve diagnostics.
Place the cursor on a function and run LSPRAG: Generate Unit Test Codes.

Option B: Build from Source

Complete the extension installation (see Setup Guide).
Launch in Development Mode:
- Open /LSPRAG/src/extension.ts
- Press F5 to launch the Extension Development Host
Configure the workspace:
- Open the target project (e.g., experiments/projects/black).
- Select the Python interpreter for Python projects.

Reproduce Experiment Results (Table 3)

Approach

Two options are available: generate tests manually or use a pre-generated dataset.

Option A: Generate Unit Tests (Manual)

cd /LSPRAG
npm install
npm run compile

# Launch development mode
# Open /LSPRAG/src/extension.ts and press F5

# Run experiment from the command palette
# LSPRAG::Python-Experiment | LSPRAG::Java-Experiment | LSPRAG::Go-Experiment

Option B: Use Pre-generated Dataset (Recommended)

Download the dataset archive and extract it to /LSPRAG/experiments:

https://drive.google.com/file/d/1labc05nmta4fhW05RoGuypk4NsoYjHf2/view

Java Projects [Commons-CLI, Commons-CSV]

Java Setup

cd /LSPRAG/scripts
wget --no-check-certificate "https://cloud.tsinghua.edu.cn/f/efade5fc56a54ee59ed1/?dl=1" -O ../javaLib.tar.gz
tar xvf ../javaLib.tar.gz

# JARs will be placed under /LSPRAG/experiments/lib

[Optional] Reproduce by Generating New Tests

# In VS Code: press F5 to launch the Extension Development Host, open the project,
# choose the model, then run the experiment from the command palette:
# Ctrl + Shift + P -> LSPRAG::Java-Experiment

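If you run the tests from the command line rather than in debugging mode, make sure the generated test directory is on the test classpath. The following Maven snippet for pom.xml adds src/lsprag/test/java as an additional test source directory: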

<build>
  <testSourceDirectory>src/test/java</testSourceDirectory>
  <testResources>
    <testResource>
      <directory>src/test/resources</directory>
    </testResource>
  </testResources>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>build-helper-maven-plugin</artifactId>
      <version>3.2.0</version>
      <executions>
        <execution>
          <id>add-test-source</id>
          <phase>generate-test-sources</phase>
          <goals>
            <goal>add-test-source</goal>
          </goals>
          <configuration>
            <sources>
              <source>src/lsprag/test/java</source>
            </sources>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

Commons-CLI Project Setup

mkdir -p /LSPRAG/experiments/projects
cd /LSPRAG/experiments/projects
git clone https://github.com/apache/commons-cli.git
cd commons-cli
mvn install -DskipTests -Drat.skip=true
mvn dependency:copy-dependencies

Commons-CSV Project Setup

mkdir -p /LSPRAG/experiments/projects
cd /LSPRAG/experiments/projects
git clone https://github.com/apache/commons-csv.git
cd commons-csv
mvn install -DskipTests -Drat.skip=true
mvn dependency:copy-dependencies

Go Projects [logrus, cobra]

Logrus Project Setup

mkdir -p /LSPRAG/experiments/projects
cd /LSPRAG/experiments/projects
git clone https://github.com/sirupsen/logrus.git
cd logrus
go env -w GOPROXY=https://goproxy.io,direct
go mod tidy

Cobra Project Setup

mkdir -p /LSPRAG/experiments/projects
cd /LSPRAG/experiments/projects
git clone https://github.com/spf13/cobra.git
cd cobra
go env -w GOPROXY=https://goproxy.io,direct
go mod tidy

Python Projects [black, crawl4ai]

Black Project Setup

mkdir -p /LSPRAG/experiments/projects
cd /LSPRAG/experiments/projects
git clone https://github.com/psf/black.git
cd black
git checkout 8dc912774e322a2cd46f691f19fb91d2237d06e2
python3 -m venv venv
source venv/bin/activate
pip install coverage pytest pytest-json-report
pip install -r docs/requirements.txt
pip install -r test_requirements.txt
pip install click mypy_extensions packaging urllib3 pathspec platformdirs
echo "version = '00.0.0'" > src/black/_black_version.py
rm pyproject.toml
# Copy dataset if using pre-generated data
cp -r /LSPRAG/experiments/data/black/* .

Crawl4ai Project Setup

mkdir -p /LSPRAG/experiments/projects
cd /LSPRAG/experiments/projects
git clone https://github.com/unclecode/crawl4ai.git
cd crawl4ai
git checkout 8878b3d032fb21ce3567b34db128bfa64687198a
python3 -m venv venv
source venv/bin/activate
pip install coverage pytest selenium pytest-json-report
pip install -r requirements.txt
cp -r /LSPRAG/experiments/data/crawl4ai/* .
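
For both Python projects, test runs go wrong quietly if the virtual environment is not active or a test dependency is missing. The following sketch checks both conditions from inside the interpreter; the distribution names come from the pip install lines above, and the helper names in_virtualenv and missing_packages are hypothetical.

```python
import sys
from importlib import metadata
from typing import Iterable, List


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed in this interpreter."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    # Distribution names taken from the pip install lines in the setup steps.
    for name in missing_packages(["coverage", "pytest", "pytest-json-report"]):
        print("not installed:", name)
```

Run it with the interpreter selected in VS Code so that it reports on the same environment the experiments will use.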

Reproduce Experiment Results (Table 4)

Analyze token usage and generation time from the generated logs. Example command (Commons-CLI and Commons-CSV logs for gpt-4o):

python3 scripts/anal_cost.py \
  experiments/log-data/commons-cli/results_gpt-4o/logs/gpt-4o \
  experiments/log-data/commons-csv/results_gpt-4o/logs/gpt-4o

Repeat for Go and Python projects by pointing to their respective logs/gpt-4o folders.
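
The per-language invocations can be assembled with a short helper such as the one below. This is a sketch under assumptions: the log directory names for the Go and Python projects are assumed to mirror the commons-cli and commons-csv pattern shown above, and the names PROJECTS, log_dirs, and missing_dirs are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Project names follow the setup sections; the directory names for the Go and
# Python projects are assumed to match the commons-cli/commons-csv layout.
PROJECTS: Dict[str, List[str]] = {
    "java": ["commons-cli", "commons-csv"],
    "go": ["logrus", "cobra"],
    "python": ["black", "crawl4ai"],
}


def log_dirs(root: str, language: str, model: str = "gpt-4o") -> List[Path]:
    """Build the logs/<model> directories for every project of one language."""
    base = Path(root) / "experiments" / "log-data"
    return [
        base / project / f"results_{model}" / "logs" / model
        for project in PROJECTS[language]
    ]


def missing_dirs(paths: List[Path]) -> List[Path]:
    """Return the paths that do not exist as directories."""
    return [p for p in paths if not p.is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from the LSPRAG repository root.
    for language in PROJECTS:
        paths = log_dirs(".", language)
        for path in missing_dirs(paths):
            print(f"missing: {path}")
        print("python3 scripts/anal_cost.py " + " ".join(str(p) for p in paths))
```

The script only prints the commands, so the paths can be reviewed before running the analysis.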

Inspect Other Throughput Results

Each dataset folder contains history, logs, and results. Use the logs/gpt-4o paths as inputs to the analysis scripts.

Conclusion

If you encounter issues, please open an issue or email iejw1914@gmail.com.

Happy Testing with LSPRAG! 🎉