Building LLM Evaluation Pipelines | Boolean & Beyond