Architecture Decision: Where Should Citrus Backend Live?
TL;DR Recommendation
Start in toml-merge, extract to tree_haver after validation
Key Discovery
Citrus::Match objects have an events array where events[0] is the rule name (Symbol).
This provides a grammar-agnostic type system!
match.events.first # => :table, :keyvalue, :string, etc.
This changes everything - we CAN build a generic Citrus backend.
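As a minimal sketch of why this matters, the helper below derives a node type from a match-like object without knowing anything about the grammar. `FakeMatch`, `FakeRule`, and `rule_type` are hypothetical names used so the example runs without the citrus gem. The sketch also tolerates `events[0]` being a rule object that exposes `name` rather than a bare Symbol; that fallback is an assumption, not something confirmed here.

```ruby
# Stand-in for Citrus::Match so the sketch runs without the citrus gem.
# Only the `events` reader described above is modeled (hypothetical).
FakeMatch = Struct.new(:events)

# Stand-in for a rule object that carries its name (assumption).
FakeRule = Struct.new(:name)

# Returns the rule name for a match-like object as a Symbol, or nil
# when the events array is empty. Accepts either shape of events[0]:
# the name itself, or an object exposing the name via `name`.
def rule_type(match)
  first = match.events.first
  first = first.name if !first.is_a?(Symbol) && first.respond_to?(:name)
  first&.to_sym
end

toml_match = FakeMatch.new([:table, :close, 12])
json_match = FakeMatch.new([FakeRule.new(:object), :close, 2])

puts rule_type(toml_match).inspect  # :table
puts rule_type(json_match).inspect  # :object
```

The same helper serves a TOML grammar and a JSON grammar, which is the property a generic backend depends on.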
Three Options Compared
Option 1: Citrus Backend Only in toml-merge
toml-merge/
  lib/toml/merge/
    backends/
      tree_sitter.rb          # Existing
      citrus.rb               # New - full implementation
        ├── match_wrapper.rb  # Wraps Citrus::Match
        ├── parser.rb         # Parsing
        └── node_adapter.rb   # Full Node interface
Pros:
- ✅ Fastest to implement
- ✅ Keeps tree_haver simple
- ✅ Can iterate quickly
- ✅ No cross-gem coordination
Cons:
- ❌ All Citrus logic in toml-merge
- ❌ Other *-merge gems must duplicate
- ❌ Harder to extract later
- ❌ Mixing generic + TOML-specific
Option 2: Citrus Backend Only in tree_haver
tree_haver/
  lib/tree_haver/backends/
    citrus/
      node.rb       # Generic Citrus::Match wrapper
      parser.rb     # Parsing
      language.rb   # Grammar loading
      point.rb      # Position calculation
toml-merge/
  lib/toml/merge/
    backends/
      tree_sitter.rb  # Uses tree_haver
      citrus.rb       # Thin adapter - just TOML semantics
Pros:
- ✅ Clean separation (generic vs semantic)
- ✅ Other *-merge gems can reuse
- ✅ Consistent with tree_haver design
- ✅ Promotes Citrus ecosystem
Cons:
- ❌ Unproven architecture
- ❌ More upfront complexity
- ❌ Harder to change if wrong
- ❌ Cross-gem coordination needed
Option 3: Staged Approach ⭐ RECOMMENDED
Phase 1: Build in toml-merge
toml-merge/backends/citrus/
├── Full implementation
└── Clearly marked: generic vs TOML-specific
Phase 2: Extract to tree_haver (after validation)
tree_haver/backends/citrus/
└── Generic parts moved here
toml-merge/backends/citrus/
└── Only TOML-specific parts remain
Pros:
- ✅ Low risk - validate before extraction
- ✅ Fast initial implementation
- ✅ Learn the right boundaries
- ✅ Can refine before making it generic
- ✅ Benefits of both approaches
Cons:
- More steps overall
- Temporary duplication during transition
- But: both costs apply only during the transition and end once extraction is complete
Decision Matrix
| Criteria | Only toml-merge | Only tree_haver | Staged |
|---|---|---|---|
| Time to first working | Fast ✅ | Slow ❌ | Fast ✅ |
| Risk of wrong abstraction | Low ✅ | High ❌ | Low ✅ |
| Reusability | None ❌ | High ✅ | High ✅ |
| Separation of concerns | Poor ❌ | Excellent ✅ | Excellent ✅ |
| Flexibility to iterate | High ✅ | Low ❌ | High ✅ |
| Long-term maintenance | Higher ❌ | Lower ✅ | Lower ✅ |
| Implementation effort | Medium | High | Medium |
Winner: Staged Approach - Best of both worlds
Implementation Plan: Staged Approach
Stage 1: Build in toml-merge (Weeks 1-2)
Goal: Get Citrus backend working, learn the patterns
# lib/toml/merge/backends/citrus.rb
module Toml::Merge::Backends
  module Citrus
    # Mark what's generic with comments

    class MatchWrapper # GENERIC - could move to tree_haver
      def initialize(match)
        @match = match
      end

      def type
        @match.events.first # Rule name
      end

      def start_byte
        @match.offset
      end

      # ... etc - all generic Citrus mechanics
    end

    class TomlNodeAdapter # TOML-SPECIFIC - stays in toml-merge
      def initialize(wrapped_match)
        @wrapped = wrapped_match
      end

      def table?
        @wrapped.type == :table
      end

      # ... TOML semantics
    end
  end
end
Deliverables:
- Working Citrus backend
- Full test coverage
- Documentation of generic vs specific
- Performance benchmarks
Stage 2: Validate & Refine (Weeks 3-4)
Goal: Use in production, find edge cases
Tasks:
- Deploy to production
- Gather metrics
- Fix bugs
- Refine boundaries
- Document extraction plan
Success Criteria:
- All tests passing
- Performance acceptable
- Clear boundary identified
- Ready to extract
Stage 3: Extract to tree_haver (Weeks 5-6)
Goal: Move generic parts to tree_haver
# tree_haver/lib/tree_haver/backends/citrus.rb
module TreeHaver::Backends
  module Citrus
    class Node # Extracted from toml-merge
      # Generic Citrus::Match wrapper
    end

    class Parser
      # Generic grammar loading/parsing
    end
  end
end

# toml-merge/lib/toml/merge/backends/citrus.rb
module Toml::Merge::Backends
  module Citrus
    # Now just uses tree_haver + adds TOML semantics
    class Adapter
      def initialize(tree_haver_node)
        @node = tree_haver_node
      end

      def table?
        @node.type == :table
      end

      # ... TOML-specific only
    end
  end
end
Deliverables:
- tree_haver gains Citrus backend
- toml-merge simplified
- All tests still passing
- Documentation updated
Stage 4: Polish & Document (Week 7)
Goal: Make it easy for others to use
Tasks:
- Write tree_haver Citrus guide
- Document grammar requirements
- Add examples
- Update READMEs
- Blog post/announcement
What Goes Where (After Extraction)
tree_haver (Generic Citrus Mechanics)
Purpose: Make ANY Citrus grammar work like tree-sitter
# Generic capabilities:
- Wrap Citrus::Match
- Extract type from events[0]
- Provide position info (bytes + points)
- Child traversal
- Capture access
- Text extraction
Example usage:
# Works with ANY Citrus grammar
language = TreeHaver::Language.from_citrus_grammar(
  path: "path/to/grammar.citrus",
  grammar_module: MyFormat::Document
)
parser = TreeHaver::Parser.new
parser.language = language
tree = parser.parse(source)
node = tree.root_node
node.type # => :object (from grammar rule name)
node.start_byte # => 0
node.children # => [...]
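The "position info (bytes + points)" capability is the one piece of the generic layer that needs real computation, since a match only carries an offset. A rough sketch of turning an offset into a row/column point follows. `Point` and `PointIndex` are hypothetical names, and the sketch assumes offsets are byte offsets (as the `start_byte` naming suggests) and that columns are counted in bytes.

```ruby
# Hypothetical value object holding a zero-based row and byte column.
Point = Struct.new(:row, :column)

# Converts byte offsets into row/column points for one source string.
class PointIndex
  def initialize(source)
    # Byte offset at which each line starts; row 0 starts at byte 0.
    @line_starts = [0]
    offset = 0
    source.each_line do |line|
      offset += line.bytesize
      # A new row begins only after a line that ends with a newline.
      @line_starts << offset if line.end_with?("\n")
    end
  end

  def point_at(byte_offset)
    # Binary search for the first line starting after the offset;
    # the row containing the offset is the one just before it.
    next_row = @line_starts.bsearch_index { |start| start > byte_offset }
    row = next_row ? next_row - 1 : @line_starts.length - 1
    Point.new(row, byte_offset - @line_starts[row])
  end
end

index = PointIndex.new("a = 1\n[t]\nb = 2")
point = index.point_at(7)
puts "row=#{point.row} column=#{point.column}"  # row=1 column=1
```

Building the line-start table once per source keeps each lookup logarithmic, which matters if every node asks for both its start and end point.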
toml-merge (TOML Semantics)
Purpose: Understand TOML-specific structure
# TOML-specific knowledge:
- table rule => Table semantics
- keyvalue rule => Pair semantics
- array rule => Array semantics
- Comment handling
- Table header extraction
- Key name extraction
- Value parsing
Example usage:
analysis = Toml::Merge::FileAnalysis.new(
  source,
  backend: :citrus # Uses tree_haver's Citrus backend
)
node = analysis.statements.first
node.table? # => true (TOML-specific method)
node.table_name # => "section" (TOML-specific)
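One possible shape for the TOML-specific layer is a thin adapter that maps rule names to semantic kinds and leaves everything else to the generic node. In the sketch below, `GenericNode`, `TomlNode`, and `KINDS` are hypothetical names, and `table_name` assumes the node text begins with a header such as `[section]`; a real grammar may expose the name through a capture instead.

```ruby
# Stand-in for the generic node tree_haver would provide (hypothetical).
GenericNode = Struct.new(:type, :text)

# Adds TOML meaning on top of a generic node.
class TomlNode
  # Rule name => semantic kind, following the mapping listed above.
  KINDS = { table: :table, keyvalue: :pair, array: :array }.freeze

  def initialize(node)
    @node = node
  end

  # Semantic kind for this node, or :other for unmapped rules.
  def kind
    KINDS.fetch(@node.type, :other)
  end

  def table?
    kind == :table
  end

  def pair?
    kind == :pair
  end

  # Extracts the name from a "[name]" or "[[name]]" header.
  # Assumes the header is the first thing in the node text.
  def table_name
    return nil unless table?

    @node.text[/\A\s*\[\[?\s*([^\]]+?)\s*\]\]?/, 1]
  end
end

node = TomlNode.new(GenericNode.new(:table, "[section]\nkey = 1\n"))
puts node.table?      # true
puts node.table_name  # section
```

Nothing in `TomlNode` touches Citrus directly, so the same adapter could sit on top of either backend once the generic layer is extracted.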
Risk Mitigation
Risk: “What if we extract wrong?”
Mitigation: Stage 2 validation finds issues before extraction
Risk: “What if boundaries are unclear?”
Mitigation: Clear commenting during Stage 1, refined in Stage 2
Risk: “What if no one else uses Citrus?”
Mitigation: Still valuable for toml-merge portability
Risk: “What if performance is bad?”
Mitigation: Measure in Stage 2, optimize before extraction
Success Metrics
Stage 1 Success:
- Citrus backend passes all toml-merge tests
- Performance within 2x of tree-sitter
- Clear generic/specific boundary documented
Stage 2 Success:
- Used in production without issues
- Edge cases identified and handled
- Extraction plan documented
Stage 3 Success:
- tree_haver has Citrus backend
- toml-merge code reduced
- All tests passing
- Performance maintained
Stage 4 Success:
- Documentation complete
- Examples working
- Other gems can adopt pattern
- Community feedback positive
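The Stage 1 target of staying within 2x of tree-sitter can be checked mechanically. The following is a sketch only: `median_time` and `within_budget?` are hypothetical helpers, and the two lambdas are placeholders standing in for the real parse calls of each backend.

```ruby
require "benchmark"

# Median wall-clock seconds over several runs of the given block.
def median_time(runs: 5)
  times = Array.new(runs) { Benchmark.realtime { yield } }.sort
  times[times.length / 2]
end

# True when the candidate is no slower than `factor` times the baseline.
def within_budget?(candidate, baseline, factor: 2.0)
  candidate <= baseline * factor
end

source = "key = \"value\"\n" * 1_000

# Placeholders: swap in the real backend parse calls when measuring.
tree_sitter_parse = ->(src) { src.lines.map(&:strip) }
citrus_parse      = ->(src) { src.lines.map { |line| line.split("=") } }

baseline  = median_time { tree_sitter_parse.call(source) }
candidate = median_time { citrus_parse.call(source) }

puts format("tree-sitter: %.6fs  citrus: %.6fs  within 2x: %s",
            baseline, candidate, within_budget?(candidate, baseline))
```

Using the median rather than a single run reduces the effect of GC pauses and warm-up on the comparison.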
Timeline
Week 1-2: Build in toml-merge
Week 3-4: Validate & refine
Week 5-6: Extract to tree_haver
Week 7: Polish & document
Total: ~7 weeks to complete architecture
Conclusion
Staged approach is the clear winner:
- ✅ Low risk - validate before committing
- ✅ Fast start - no cross-gem coordination needed
- ✅ Right abstractions - learn before extracting
- ✅ Long-term benefits - ends with clean architecture
- ✅ Flexibility - can stop after Stage 1 if needed
Start building the Citrus backend in toml-merge NOW.
Extract to tree_haver once we’ve learned what truly belongs there.
Next Actions
- Create lib/toml/merge/backends/citrus/ directory
- Implement MatchWrapper (generic part)
- Implement TomlNodeAdapter (specific part)
- Add backend selection logic
- Write tests
- Measure performance
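For the backend selection step, one possible approach is to resolve a requested backend against whatever is actually available at load time. `BackendSelection`, `PREFERENCE`, and the `:auto` option are hypothetical, and preferring tree-sitter over Citrus when both are present is an assumption, not a decision recorded in this document.

```ruby
# Hypothetical helper for choosing a parsing backend.
module BackendSelection
  # Assumed preference order when the caller asks for :auto.
  PREFERENCE = %i[tree_sitter citrus].freeze

  # requested: :auto, :tree_sitter, or :citrus
  # available: backends whose dependencies loaded successfully
  def self.choose(requested, available:)
    candidates = requested == :auto ? PREFERENCE : [requested]

    unknown = candidates - PREFERENCE
    unless unknown.empty?
      raise ArgumentError, "unknown backend: #{unknown.first.inspect}"
    end

    chosen = candidates.find { |backend| available.include?(backend) }
    raise ArgumentError, "backend not available: #{requested.inspect}" if chosen.nil?

    chosen
  end
end

puts BackendSelection.choose(:auto, available: [:citrus])                # citrus
puts BackendSelection.choose(:auto, available: [:citrus, :tree_sitter])  # tree_sitter
```

Passing `available` in explicitly keeps the decision testable without either parser installed.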
Let’s start with Stage 1!