Pinned · Essential Math & Concepts for LLM Inference · Back-of-the-envelope calculations to estimate a model’s GPU memory requirements & insights into HW/SW optimizations · May 31
Pinned · The power of Mechanical Sympathy in Software Engineering · Exploring Locality of Reference, LMAX Disruptor & Flash Attention · Apr 18
Pinned · Published in GoPenAI · CPU & GPU — The Basics · A digestible high-level overview of what happens in The Die · Apr 8
IMHO, this is not the correct benchmark. "The mean processing time is 286ms for Rust versus 436ms for Java." --> On the contrary, it proves how fast LMAX Disruptor is. · May 21
How Understanding CPU Caches Can Supercharge Your Code: 461% Faster MatMul Case Study · The power of Mechanical Sympathy in Software Engineering — Part 1 · Apr 25
OS Error: Too many open files. Understanding file and socket descriptors. · Debugging resource leakage and optimizing server configuration · Mar 26
HTTP Load Balancer on Top of WSO2 Gateway — Part 1: Project Repository, Architecture and Features · It’s been almost four months, and it has been an amazing journey! At this point, I would like to thank my mentors Isuru Ranawaka and Kasun… · Nov 13, 2022
HTTP Load Balancer performance test · In my previous post, I discussed the Load Balancer Engine Architecture and its features. In this post, I’ll be discussing performance… · Nov 13, 2022
GSoC — Community Bonding Period · This year, the community bonding period ran from April 22nd to May 22nd. Though I was communicating with my mentor and community members… · Nov 13, 2022